Abstract

The guessing number of a directed graph (digraph), equivalent to the entropy of that digraph, was 
introduced as a direct criterion on the solvability of a network coding instance. This paper makes 
two contributions on the guessing number. First, we introduce an undirected graph on all possible 
configurations of the digraph, referred to as the guessing graph, which encapsulates the essence of 
dependence amongst configurations. We prove that the guessing number of a digraph is equal to the 
logarithm of the independence number of its guessing graph. Therefore, network coding solvability is no
longer a problem on the operations made by each node, but is simplified into a problem on the messages that
can transit through the network. By studying the guessing graph of a given digraph, and how to combine 
digraphs or alphabets, we are thus able to derive bounds on the guessing number of digraphs. Second, we 
construct specific digraphs with high guessing numbers, yielding network coding instances where a large 
amount of information can transit. We first propose a construction of digraphs with finite parameters 
based on cyclic codes, with guessing number equal to the degree of the generator polynomial. We then 
construct an infinite class of digraphs with arbitrary girth for which the ratio between the linear guessing 
number and the number of vertices tends to one, despite these digraphs being arbitrarily sparse. These 
constructions yield solvable network coding instances with a relatively small number of intermediate 
nodes for which the node operations are known and linear, although these instances are sparse and the 
sources are arbitrarily far from their corresponding sinks. 
I. Introduction

Network coding [1] is a protocol which outperforms routing for multicast networks by letting the
intermediate nodes manipulate the packets they receive. In particular, linear network coding [2] is optimal
in the case of one source; however, it is not the case for multiple sources [3], [4]. Although for large
dynamic networks, good heuristics such as random linear network coding Q, can be used, for a given 
static network maximizing the amount of information that can be transmitted is fundamental. Solving 
this problem by brute force, i.e. considering all possible operations at all nodes, is computationally 
prohibitive. In this paper, we reduce this problem to finding a maximum independent set in an undirected 
graph determined by the network coding instance. 

Network coding also opens many new questions about network design (see Q, (8[ for examples of 
networks with interesting properties). Clearly, dense graphs with a large number of edges between the 
nodes can transmit a large amount of information; similarly, a small diameter is a good property for 
information transfer; finally, a large number of intermediate nodes between the sources and the sinks 
is preferable. However, in this paper, we introduce classes of networks that are arbitrarily sparse, with 
arbitrarily high diameters, and with a relatively small number of intermediate nodes, yet on which all 
the requested information can be transmitted. Furthermore, for these graphs, the demands of the sinks 
can be satisfied over any alphabet, and linear combinations are sufficient. Therefore, our work provides 
different guidelines on the design of networks which take advantage of network coding. The results in 
this paper are based on the study of the guessing number of digraphs, reviewed below. 

The guessing number of digraphs is a concept introduced in [9], which connects graph theory, network
coding, and circuit complexity theory. In [9] it was proved that an instance of network coding with n
sources and n sinks on an acyclic network (referred to as a multiple unicast network) is solvable over a
given alphabet if and only if the guessing number of a related digraph is equal to n. Moreover, it is proved
in [9], [10] that any network coding instance can be reduced to a multiple unicast network. Therefore, the
guessing number is a direct criterion on the solvability of network coding. Similarly, the linear guessing 
number evaluates the solvability of a network coding instance by using linear combinations only. By 
determining these two quantities, the performance of linear network coding can then be compared to that 
of general network coding. In [11], the guessing number is also used to disprove a long-standing open
conjecture on circuit complexity. In [12], the guessing number and linear guessing number of digraphs
were studied, and bounds on the guessing number of some particular digraphs were derived. 

The guessing number is equal to the entropy of the same digraph [11], thus tying this quantity to
fundamental problems of information theory. For instance, by relying heavily on [13], [14] and [15], it
was shown that the entropy of a digraph might not be determined by the use of Shannon inequalities
alone [16]. Similarly, the information defect is related to the so-called public entropy [16]. We would
like to emphasize that the graph entropy for digraphs considered in this paper is fundamentally different
from the graph entropy for undirected graphs introduced by Körner in [17] (see [18] for a review of that
quantity).

Let us give a brief description of the guessing game with n players, viewed as vertices on a digraph D, 
and an alphabet of size s. All the players are assigned an element of the alphabet (collectively referred to 
as a configuration), and each player knows the values assigned to all the players in its in-neighborhood. 
It does not, however, know its own value, and the goal of the game is to guess it correctly. Clearly, 
the values cannot all be guessed correctly every time. If the players do not collaborate, the probability 
that all guesses are correct is exactly $s^{-n}$. However, the players may devise a collaborative strategy
(referred to as a protocol) which increases the probability of success. For instance, suppose we play
the game on the clique $K_n$, where each player knows the values assigned to all the other vertices. A
common strategy could be the following: each player guesses the opposite of the sum (modulo s) of
all the values it sees. Any configuration whose sum modulo s is zero will be correctly guessed, hence
raising the success probability to $s^{-1} = s^{(n-1)-n}$ (this is, in fact, optimal). The guessing number is then
defined as the maximum over all protocols of the gain from the trivial guessing strategy. For instance,
the guessing number of the clique on n vertices is $n - 1$.
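
The clique strategy described above can be checked by exhaustive enumeration for small parameters. The following is a minimal sketch, not taken from the paper: the function name `clique_success_count` and the chosen values of n and s are illustrative assumptions.

```python
from itertools import product

def clique_success_count(n, s):
    """Count configurations of K_n over Z_s on which every player guesses correctly
    when each player guesses minus the sum (mod s) of the values it sees."""
    wins = 0
    for x in product(range(s), repeat=n):
        total = sum(x)
        # Player i sees every value except its own x[i].
        guesses = [(-(total - x[i])) % s for i in range(n)]
        if all(g == xi for g, xi in zip(guesses, x)):
            wins += 1
    return wins

n, s = 4, 3  # illustrative parameters
print(clique_success_count(n, s), s ** (n - 1))
```

Both printed numbers agree, since a configuration is guessed correctly by everyone exactly when its coordinates sum to zero modulo s, which leaves $s^{n-1}$ configurations out of $s^n$.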

Suppose now the players have a helper, whose aim is to make all players guess correctly every time. 
This helper is limited: he or she can only send the same information to all the players. The information 
defect is defined to be the minimum amount of information the helper must send, and it is strongly 
connected to the guessing number. For instance, in $K_n$, the players will be able to infer their own value
if the helper sends them the sum of all values modulo s. Only one symbol of information is required, 
therefore the information defect of the clique on n vertices is equal to 1. While the guessing number 
g(D, s) represents the amount of information that can be guessed by the players, the information defect 
b(D, s) is the amount of common information the players need to guess correctly. The information defect 
is shown in [11] to be equal to the length of a minimal index code induced on the graph D (see [19] for
more on index coding and its relation to network coding).

This paper has two main contributions. First, we introduce a graph on all the possible configurations 
of a digraph, referred to as the guessing graph, which encapsulates the dependencies amongst fixed 
configurations of the same protocol. We then show that the guessing number of a digraph is equal to
the logarithm of the independence number of its guessing graph. The study of the guessing graph then 
yields the following results. 

• Solvability of network coding is no longer a problem of determining the appropriate operations at
each intermediate node. It is now turned into a problem on the possible messages that could be 
transmitted through the network by using network coding, and the operations which transmit these 
messages can then be easily determined. This simplification significantly reduces the search space, 
which only depends on the number of nodes in the graph and on the alphabet size. 

• The problem of solvability of network coding is reduced to a decision problem on the independence 
number of undirected graphs. Although the guessing graph has an exponential number of vertices, it 
has a large automorphism group, which could be taken advantage of. We show that finding maximum 
independent sets on this graph is actually a problem closely related to the design of error-correcting 
codes. This parallels the results in [20], where it was shown that some classes of network coding
instances are solvable if and only if codes with certain parameters exist. 

• Using graph theoretic results, we are then able to provide chains of bounds on the guessing number 
of a digraph based on the properties of its guessing graph. For instance, we obtain that for large 
enough alphabets, the guessing number is at least equal to the minimum in-degree of a vertex in 
the digraph, and the fixed configurations attaining this bound form an MDS code. 

• The relationship between the guessing game and public information (or equivalently, between public 
and private entropy) unveiled in [11] is clarified, as we show that the information defect is equal
to the chromatic number of the guessing graph. This enables us to prove that these problems are 
asymptotically equivalent. 

• The guessing graph is extremely well-behaved when digraphs are combined. We exhibit some types 
of digraph union which do not increase the ratio between the guessing number and the number of 
vertices in the digraph. Also, the guessing graph illustrates the relationships between the guessing 
numbers of the same digraph over different alphabets. We prove that playing the guessing game on 
a digraph over an extension field is equivalent to playing the guessing game on several copies of 
the same digraph linked to one another over the base field. 

We would like to emphasize the fundamental difference between our work and the literature where 
conflicts in networks were represented as adjacent vertices in graphs [21]-[23]. In the literature, the
vertices of the different graphs and hypergraphs previously proposed are routes or links amongst nodes
or coding functions instead of messages or configurations. Therefore, these do not convert the network
coding problem into a problem on messages. Indeed, the vertices of the so-called "link graph" in [21]
are the routes from the inputs to the outputs, and two routes conflict if they intersect. Also, the vertices
correspond to the cumulative coding functions at each node in [22], and the conflicts amongst functions
are represented via a hypergraph. Moreover, the vertices of the so-called "conflict graph" in [23] represent
a node in the network along with part of its out-neighbors.

The second contribution is the construction of specific digraphs with high linear guessing numbers, 
thus yielding solvable network coding instances. 

• For a finite number n of source-sink pairs, we introduce a construction of digraphs based on cyclic
codes, thus forging another link between network coding and error-correcting codes. All the information
about the digraph, and especially its guessing number, is available from the generator polynomial
of the code. In particular, the class of digraphs generated by the simplex codes produces network
coding instances with bottlenecks on the order of log n only.

• For unbounded parameters, we determine a way of combining two digraphs, referred to as the strong 
product, which takes full advantage of the structure of the two original digraphs in order to yield 
a high guessing number. Using this technique, we construct network coding instances as sparse 
as possible in terms of edges provided the number of edges tends to infinity, where the shortest 
path between a source and the corresponding sink is arbitrarily long, and where the number of 
intermediate nodes is small compared to the number of sources. These instances are solvable over 
any alphabet and linearly solvable over any field. 

The rest of the paper is organized as follows. Section II reviews some necessary background on graph
theory, guessing games, and error-correcting codes. Section III introduces and investigates the properties
of the guessing graph. In Section IV we introduce a class of digraphs based on cyclic codes for which
we determine the binary linear guessing number. Section V studies the maximum guessing number of
digraphs and introduces families of graphs with asymptotically highest guessing numbers. Finally, Section
VI provides some comments and presents some open problems.

II. Preliminaries 

A. Graphs and digraphs 

An independent set in a graph is a set of vertices where any two vertices are non-adjacent. The
independence number $\alpha(G)$ of an undirected graph G is the maximum cardinality of an independent set.
We also denote the maximum degree and the clique, chromatic, and fractional chromatic numbers of an
undirected graph G as $\Delta(G)$, $\omega(G)$, $\chi(G)$, and $\chi^*(G)$, respectively (see [24] for definitions of these
parameters). For a connected vertex-transitive graph which is neither an odd cycle nor a complete graph,
we have [24, Corollary 7.5.2]

$$\omega(G) \le \chi^*(G) = \frac{|V(G)|}{\alpha(G)} \le \chi(G) \le \Delta(G).$$

Also, it was shown in [25] that for a non-complete K-connected graph on n vertices which is regular
with degree d, the independence number is lower bounded by



The chromatic number and the independence number of a vertex-transitive graph are related by [26]
(using the no-homomorphism lemma in [27])

$$\chi(G) \le (1 + \log \alpha(G)) \max_{H\ \mathrm{induced}} \frac{|V(H)|}{\alpha(H)} = (1 + \log \alpha(G)) \frac{|V(G)|}{\alpha(G)}. \qquad (2)$$

We now review four types of products of graphs; all products of two graphs $G_1$ and $G_2$ have $V(G_1) \times V(G_2)$
as vertex set. We denote two adjacent vertices u and v in a graph as $u \sim v$.

• First, in the co-normal product $G_1 \oplus G_2$, we have $(u_1, u_2) \sim (v_1, v_2)$ if and only if $u_1 \sim v_1$ or
$u_2 \sim v_2$. We have

$$\alpha(G_1 \oplus G_2) = \alpha(G_1)\alpha(G_2). \qquad (3)$$

• Second, in the lexicographic product (also called composition) $G_1 \cdot G_2$, we have $(u_1, u_2) \sim (v_1, v_2)$
if and only if either $u_1 = v_1$ and $u_2 \sim v_2$, or $u_1 \sim v_1$. Although this product is not commutative,
we have

$$\alpha(G_1 \cdot G_2) = \alpha(G_1)\alpha(G_2).$$

• Third, in the strong product $G_1 \boxtimes G_2$, we have $(u_1, u_2) \sim (v_1, v_2)$ if and only if either $u_1 = v_1$ and
$u_2 \sim v_2$, or $u_2 = v_2$ and $u_1 \sim v_1$, or $u_1 \sim v_1$ and $u_2 \sim v_2$.

• Fourth, in the cartesian product $G_1 \square G_2$, we have $(u_1, u_2) \sim (v_1, v_2)$ if and only if either $u_1 = v_1$
and $u_2 \sim v_2$, or $u_2 = v_2$ and $u_1 \sim v_1$. We have

$$\chi(G_1 \square G_2) = \max\{\chi(G_1), \chi(G_2)\},$$

$$\alpha(G_1 \square G_2) \le \min\{\alpha(G_1)|V(G_2)|, \alpha(G_2)|V(G_1)|\}.$$

Throughout this paper, we shall only consider simple digraphs, which have no loops and no repeated 
edges. However, we do allow edges in both directions between two vertices, referred to as bidirectional
edges (we shall abuse notation and identify a bidirectional edge with a corresponding undirected edge). In
other words, the digraphs considered here are of the form $D = (V, E)$, where $E \subseteq V^2 \setminus \{(v, v) : v \in V\}$.
We shall denote the number of vertices of the digraph as n unless otherwise specified. The adjacency
matrix of a digraph D on n vertices is the $n \times n$ binary matrix $A_D$ such that $a_{ij} = 1$ if and only if
$(v_i, v_j) \in E(D)$. For any vertex $v_i$ of D, its in-neighborhood, denoted as $N_-(v_i)$, is the set of all vertices
$v_j$ such that $(v_j, v_i) \in E(D)$, and its in-degree is the size of its in-neighborhood. We say that a digraph
is strong if there is a path from any vertex to any other vertex of the digraph. An independent set of
vertices in a digraph is a set such that no vertex is in the in-neighborhood of another.

The girth of a digraph is the minimum length of a directed cycle (we consider a bidirectional edge
as a cycle of length 2). A digraph is acyclic if it has no directed cycles. In this case, there is an order
of the vertices $v_0, v_1, \ldots, v_{n-1}$, referred to as the topological order, for which $(v_i, v_j) \in E(D)$ only if
$i < j$ (in particular, $v_0$ has in-degree 0). The cardinality of a maximum induced acyclic subgraph of the
digraph D is denoted as mas(D). It can be easily shown that mas(D) > where A is the maximum 
in-degree of a vertex in D. 

B. Guessing game and guessing number 

We denote the ring $\mathbb{Z}_s = \{0, 1, \ldots, s - 1\}$, or the field $\mathrm{GF}(s)$ if s is the power of a prime, as [s].
A configuration on a digraph D is a map from its vertex set V(D) to [s], which we shall identify with
its image $x = (x_0, x_1, \ldots, x_{n-1})$. A protocol $\mathcal{P}$ on D is a mapping between its configurations such that
$\mathcal{P}(x)$ is locally defined, i.e. $\mathcal{P}(x)_v = f_v(x_{v_0}, x_{v_1}, \ldots, x_{v_{k-1}})$, where $k = |N_-(v)|$ and $v_i \in N_-(v)$ for
all i. For any $J \subseteq \{0, 1, \ldots, n - 1\}$, we refer to the word $(x_{j_0}, x_{j_1}, \ldots, x_{j_{|J|-1}})$, where the $j_i$'s are sorted
in increasing order and are all in J, as $x_J$. Using this notation, we have $\mathcal{P}(x)_v = f_v(x_{N_-(v)})$. The fixed
configurations of $\mathcal{P}$ are all the configurations $x \in [s]^n$ such that $\mathcal{P}(x) = x$. The guessing number of D
is then defined as the logarithm of the maximum number of configurations fixed by a protocol of D:

$$g(D, s) = \max_{\mathcal{P}} \{\log_s |\mathrm{Fix}(\mathcal{P})|\}.$$

This definition actually depends on s, and we can also consider the general guessing number $g(D) =
\sup_s g(D, s)$.

A protocol is said to be linear if the local functions are linear: $f_v(x_{N_-(v)}) = y_v \cdot x_{N_-(v)}$ for some
$y_v \in \mathrm{GF}(s)^{|N_-(v)|}$. The fixed configurations of a linear protocol form a linear subspace of $\mathrm{GF}(s)^n$. The
linear guessing number of D is the maximum dimension of the set of fixed configurations of a linear
protocol of D: $g_{\mathrm{linear}}(D, s) = \max_{\mathcal{P}\ \mathrm{linear}} \{\dim \mathrm{Fix}(\mathcal{P})\}$. It is shown in [12, Theorem 4.3] that the linear
guessing number is given by

$$g_{\mathrm{linear}}(D, s) = n - \min_{A \in \mathrm{GF}(s)^{n \times n},\, A \le A_D} \{\mathrm{rk}(I_n + A)\}, \qquad (4)$$

where $A \le B$ if and only if $a_{ij} \ne 0$ implies $b_{ij} \ne 0$. Clearly, we have $g_{\mathrm{linear}}(D, s) \le g(D, s)$ for all
digraphs D.
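
As a sanity check on (4), the minimization can be carried out exhaustively over GF(2) for very small digraphs. The sketch below is an illustration only and is not the method of [12]: the function names are hypothetical, matrix rows are stored as integer bitmasks, edges are given as pairs (i, j) meaning $a_{ij} = 1$, and the search is exponential in the number of edges.

```python
from itertools import product

def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of integer bitmasks (one per row)."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def linear_guessing_number_gf2(n, edges):
    """Evaluate n - min rk(I_n + A) over all binary A supported on the edges of D."""
    edges = list(edges)
    best = n
    for choice in product((0, 1), repeat=len(edges)):
        rows = [1 << i for i in range(n)]  # start from the identity I_n
        for keep, (i, j) in zip(choice, edges):
            if keep:
                rows[i] ^= 1 << j  # set entry (i, j); i != j since D has no loops
        best = min(best, gf2_rank(rows))
    return n - best

k3 = [(i, j) for i in range(3) for j in range(3) if i != j]  # clique K_3
c3 = [(0, 1), (1, 2), (2, 0)]                                # directed cycle C_3
path = [(0, 1), (1, 2)]                                      # an acyclic digraph
print(linear_guessing_number_gf2(3, k3),
      linear_guessing_number_gf2(3, c3),
      linear_guessing_number_gf2(3, path))
```

For the clique, choosing $A = A_D$ makes $I_n + A$ the all-ones matrix of rank 1; for an acyclic digraph, $I_n + A$ is unit triangular in topological order and therefore always has full rank.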

A set of public messages $\mathcal{M}$ is a partition of the set of configurations into b pieces of the form
$\mathrm{Fix}(\mathcal{P}_i)$, i.e. $\bigcup_{0 \le i \le b-1} \mathrm{Fix}(\mathcal{P}_i) = [s]^n$. The information defect of the digraph D is defined as the logarithm
of the minimum cardinality of a set of public messages, and is denoted as $b(D, s) = \min_{\mathcal{M}} \{\log_s |\mathcal{M}|\}$.
It was shown in [11] that for any digraph D on n vertices and any s, $b(D, s) + g(D, s) \ge n$. We also
consider the general information defect $b(D) = \inf_s b(D, s)$.

C. Relation between guessing games and network coding 

We now review how to convert a multiple unicast problem in network coding to a guessing game. 
Note that any network coding instance can be converted into a multiple unicast without any loss of
generality [10], [11]. Let $\mathcal{N}$ be an acyclic network with n sources, n sinks, and some intermediate nodes.
We suppose that each sink requests an element from an alphabet [s] from a corresponding source. This
network coding instance is solvable over [s] if all the demands of the sinks can be satisfied at the same
time. We assume the network instance is given in its circuit representation, where each vertex represents
a distinct coding function and hence the same message flows on every edge coming out of the same vertex.
This circuit representation has n source nodes, n sink nodes, and m intermediate nodes. By merging each
source with its corresponding sink node into one vertex, we form the digraph $D_{\mathcal{N}}$ on $m + n$ vertices. In
general, we have $g(D_{\mathcal{N}}, s) \le n$ for all s and the original network coding instance is solvable over [s]
if and only if $g(D_{\mathcal{N}}, s) = n$ [11]. Similarly, we have $b(D_{\mathcal{N}}, s) \ge m$ and the instance is solvable if and
only if $b(D_{\mathcal{N}}, s) = m$ [11].

Therefore, while network coding considers how the information flows from sources to sinks, the 
guessing game captures the intuitive notion of how much information circulates through the digraph. A 
protocol for the guessing game is equivalent to the network coding operations in the original instance. 
Since all network coding instances can be turned into a guessing game, the guessing game is a fundamental 
problem in information transit in networks. Conversely, if a digraph D on m + n vertices has an acyclic
induced subgraph M of size m, then the n vertices outside M can be split in two to form the circuit 
representation of a network coding instance with n sources, n sinks, and m intermediate nodes. 

Fig. 1. The butterfly network as a guessing game. (a) Circuit representation. (b) Guessing game.

We illustrate the conversion of a network coding instance to a guessing game for the famous butterfly
network in Figure 1. We shall show the vertices corresponding to the source-sink pairs in bold
with thick contours henceforth. It is well-known that the butterfly network is solvable over all alphabets
(by adding the two incoming messages modulo s in z), and conversely it was shown that the clique $K_3$
has guessing number 2 over any alphabet (and the protocol is simple: all nodes guess minus the sum
modulo s of their incoming elements).

D. Error-correcting codes 

The weight of a word x in $[s]^n$ is the number of nonzero symbols of x and is denoted as $w(x)$. A
code of length n over [s] with minimum Hamming distance d is a set of words in $[s]^n$ such that any
two distinct words differ in at least d positions. We denote the maximum cardinality of such a code as $A_s(n, d)$.
The Singleton bound asserts that $A_s(n, d) \le s^{n-d+1}$, and this bound is achieved by Maximum Distance
Separable (MDS) codes. MDS codes are known to exist for $d \in \{1, 2, n\}$ or when s is the power of a
prime and satisfies either $s \ge n - 1$ or $s = 2^m$, $n = 2^m + 2$, $d \in \{4, n - 2\}$ [28, Chapter 11, Section 7].

A binary (n, k) linear code C is a linear subspace of $\mathrm{GF}(2)^n$ with dimension k. If C is the row span
of a matrix $G \in \mathrm{GF}(2)^{k \times n}$, we say that G is a generator matrix of C. Moreover, if C is the row space of
a matrix $G' \in \mathrm{GF}(2)^{n \times n}$ of rank k, we say that $G'$ is an extended generator matrix of C. Alternatively,
if C is the dual space of the row space of a matrix $H \in \mathrm{GF}(2)^{(n-k) \times n}$ (resp., $H' \in \mathrm{GF}(2)^{n \times n}$ with rank
$n - k$), we say that H is a parity-check matrix (resp., extended parity-check matrix) of C. By definition,
we have $cH'^T = 0$ for all $c \in C$.

A (binary) cyclic code is a linear binary code where all the cyclic shifts of a codeword are also codewords.
To any vector $c = (c_0, c_1, \ldots, c_{n-1}) \in \mathrm{GF}(2)^n$, we associate the polynomial $c(x) = \sum_{i=0}^{n-1} c_i x^i$.
A cyclic code can then be viewed as an ideal in the ring of polynomials modulo $x^n + 1$, where n is the
length of the code. Therefore, a cyclic code is composed of all the multiples of a generator polynomial
$g(x)$ of degree $n - k$, where k is the dimension of the code. A generator matrix for the code is hence
given by k shifts of $g(x)$. Note that a polynomial generates a cyclic code of length n if and only if
it divides $x^n + 1$.
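
The description above translates directly into a few lines of code. The following sketch is an illustration under stated assumptions: polynomials over GF(2) are stored as integers whose bit i is the coefficient of $x^i$, all function names are hypothetical, and the example polynomial $1 + x + x^3$ with n = 7 is chosen here only because it is a standard divisor of $x^7 + 1$.

```python
def gf2_poly_mod(a, b):
    """Remainder of a(x) divided by b(x) over GF(2); bit i holds the coefficient of x^i."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def cyclic_code(n, g):
    """Return (k, codewords) for the binary cyclic code of length n generated by g(x)."""
    if gf2_poly_mod((1 << n) | 1, g) != 0:
        raise ValueError("g(x) must divide x^n + 1")
    k = n - (g.bit_length() - 1)
    basis = [g << i for i in range(k)]  # rows of a generator matrix: k shifts of g(x)
    words = {0}
    for row in basis:
        words |= {w ^ row for w in words}  # span of the rows added so far
    return k, words

def cyclic_shift(word, n):
    """Multiply by x modulo x^n + 1, i.e. rotate the coefficient vector by one position."""
    word <<= 1
    return (word & ((1 << n) - 1)) | (word >> n)

n, g = 7, 0b1011  # g(x) = 1 + x + x^3
k, words = cyclic_code(n, g)
print(k, len(words), all(cyclic_shift(w, n) in words for w in words))
```

The final check confirms that the span of the k shifts of $g(x)$ is closed under cyclic shifts, which is the defining property of the code.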

A constant-weight code is a binary code consisting of codewords with the same Hamming weight.
Such codes have attracted considerable interest; a thorough survey is provided in [29], and various upper bounds are
derived or reviewed in [30]. The maximum cardinality of a constant-weight code of length n, weight w,
and minimum distance 2d (as it is always even) is upper bounded by $\binom{n}{w-d+1} / \binom{w}{w-d+1}$ [31].

III. The guessing graph of a digraph 
A. Guessing graph, guessing number, and information defect 

In this section, we introduce an undirected graph on all possible configurations of a digraph, where 
an independent set corresponds to a set of fixed configurations of a protocol. As a result, the guessing 
number of the digraph is equivalent to the logarithm of the independence number of the associated graph. 

Definition 1 (Guessing graph of a digraph): For any digraph D on n vertices and any integer $s \ge 2$,
the s-guessing graph of D, denoted as $G(D, s)$, has $[s]^n$ as vertex set and two configurations are adjacent
if and only if there is no protocol for D which fixes them both.

Proposition 1 below enumerates some properties of the guessing graph. In particular, Property 2 provides
a concrete and elementary description of the edge set which makes adjacency between two configurations
easily decidable.

Proposition 1: The guessing graph $G(D, s)$ of a digraph D on n vertices satisfies the following
properties:

1) It has $s^n$ vertices.

2) Its edge set is $E = \bigcup_{i=0}^{n-1} E_i(s)$, where $E_i(s) = \{\{x, y\} : x_{N_-(v_i)} = y_{N_-(v_i)}, x_i \ne y_i\}$.

3) It is regular with degree

$$d(G(D, s)) = \sum_{I\ \mathrm{independent}} (-1)^{|I|-1} (s - 1)^{|I|} s^{n - |N_-(I)| - |I|},$$

where the sum is over nonempty independent sets I and $N_-(I)$ is the union of all the in-neighborhoods of vertices in I.

4) It is vertex-transitive. More particularly, for any adjacent configurations $x = (x_0, x_1, \ldots, x_{n-1})$, $y =
(y_0, y_1, \ldots, y_{n-1}) \in [s]^n$, we have

• $x + e \sim y + e$ for any $e \in [s]^n$;

• $\pi(x) \sim \pi(y)$ for any $\pi \in \mathrm{Aut}(D)$;

• if s is the power of a prime, $(\lambda_0 x_0, \lambda_1 x_1, \ldots, \lambda_{n-1} x_{n-1}) \sim (\lambda_0 y_0, \lambda_1 y_1, \ldots, \lambda_{n-1} y_{n-1})$ for
any family of nonzero scalars $\lambda_i \in \mathrm{GF}(s)$.
Proof: Property 1 follows from Definition 1. Let us prove Property 2. Let $\{x, y\} \in E_i(s)$ for some i and
let a protocol $\mathcal{P}$ with local functions $f_{v_i}$ fix x. Then $f_{v_i}(y_{N_-(v_i)}) = f_{v_i}(x_{N_-(v_i)}) = x_i \ne y_i$, hence $\mathcal{P}$ does
not fix y. Conversely, if $\{x, y\} \notin E$ then any protocol satisfying $f_{v_i}(x_{N_-(v_i)}) = x_i$ and $f_{v_i}(y_{N_-(v_i)}) = y_i$
for all i fixes both x and y.

Property 4 follows from this observation: $x \sim y$ if and only if $(x - y)_{N_-(v_i)} = 0$ and $(x - y)_i \ne 0$ for some
i. Since the guessing graph is vertex-transitive it is regular and hence we determine the number of edges
adjacent to the all-zero configuration 0. By the inclusion-exclusion principle, we have

$$d(G(D, s)) = |E \cap \{0\}| = \left| \bigcup_{i=0}^{n-1} E_i(s) \cap \{0\} \right| = \sum_{\emptyset \ne R \subseteq V} (-1)^{|R|-1} |E_R \cap \{0\}|,$$

where $E_R = \bigcap_{v \in R} E_v$, and hence we only have to determine $|E_R \cap \{0\}|$ for all $R \subseteq V$. The configurations
y adjacent to 0 through every $E_v$, $v \in R$, satisfy $w(y_R) = |R|$ and $y_{N_-(R)} = 0$, while $y_{V - N_-(R) - R}$ is arbitrary. If R is not
independent, $R \cap N_-(R) \ne \emptyset$ and the two conditions are contradictory; otherwise $R \cap N_-(R) = \emptyset$ and
there are $(s - 1)^{|R|} s^{n - |N_-(R)| - |R|}$ choices for y. ■
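
Properties 2 and 3 lend themselves to a direct numerical check on small digraphs. The sketch below is illustrative only: function names are hypothetical, edges are given as (tail, head) pairs, and the degree is obtained in two independent ways, by counting neighbors of the all-zero configuration and by the inclusion-exclusion formula over nonempty independent sets.

```python
from itertools import product, combinations

def in_neighborhoods(n, edges):
    """N_-(v) for each vertex v of a digraph given as (tail, head) pairs."""
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[v].add(u)
    return nbrs

def adjacent(x, y, nbrs):
    """Property 2: x ~ y iff some i has x_i != y_i while x and y agree on N_-(v_i)."""
    return any(x[i] != y[i] and all(x[j] == y[j] for j in nbrs[i])
               for i in range(len(x)))

def degree_by_counting(n, s, edges):
    """Number of configurations adjacent to the all-zero configuration."""
    nbrs = in_neighborhoods(n, edges)
    zero = (0,) * n
    return sum(adjacent(zero, y, nbrs) for y in product(range(s), repeat=n))

def degree_by_formula(n, s, edges):
    """Property 3: inclusion-exclusion over nonempty independent sets I."""
    nbrs = in_neighborhoods(n, edges)
    total = 0
    for size in range(1, n + 1):
        for I in combinations(range(n), size):
            union = set().union(*(nbrs[v] for v in I))
            if union & set(I):
                continue  # I is not independent
            total += (-1) ** (size - 1) * (s - 1) ** size * s ** (n - len(union) - size)
    return total

c4 = [(i, (i + 1) % 4) for i in range(4)]  # directed cycle on 4 vertices
print(degree_by_counting(4, 3, c4), degree_by_formula(4, 3, c4))
```

The two values agree on this example; the only independent sets of size two in the directed 4-cycle are the two pairs of opposite vertices, which contribute the negative correction terms.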

The guessing graph of some particular digraphs can be characterized. 

Example 1: The following guessing graphs are easy to determine. 

• The guessing graph of an acyclic digraph is the complete graph. 

• The guessing graph of the clique $K_n$ is given by the Hamming graph $H(s, n)$, where two configurations
are adjacent if and only if they are at Hamming distance 1.

• In the guessing graph of the directed cycle $C_n$, two configurations are adjacent if and only if they
are at Hamming distance at most $n - 1$.

Proof: If D is acyclic, let us sort the vertices in topological order, so that $N_-(v_i) \subseteq \{v_0, v_1, \ldots, v_{i-1}\}$.
Consider two distinct configurations $x, y \in [s]^n$, and let $l = \min\{i : x_i \ne y_i\}$, then $x_{N_-(v_l)} = y_{N_-(v_l)}$
and $\{x, y\} \in E_l(s)$.

We now determine the guessing graph of the clique $K_n$. We have $E_i(s) = \{\{x, y\} : x_i \ne y_i, x_{V - \{i\}} =
y_{V - \{i\}}\}$ and hence x and y are adjacent if and only if they differ in exactly one coordinate.

We now consider the cycle $C_n$, whose edge set is given by $\{(v_i, v_{i+1 \bmod n}) : 0 \le i \le n - 1\}$. Suppose
x and y are distinct and non-adjacent, then there exists i such that $x_i \ne y_i$. Since $\{x, y\} \notin E_i(s)$, we
have $x_{i-1} \ne y_{i-1}$. Applying this recursively, we obtain that all coordinates of x and y must be distinct.
Conversely, if $x_i \ne y_i$ for all i, then it is clear that x and y are not adjacent. ■
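
The three characterizations in Example 1 can also be confirmed by brute force for small n and s. This is a minimal sketch with hypothetical helper names; digraphs are described by their in-neighborhood lists, and the transitive tournament is used as one representative acyclic digraph.

```python
from itertools import product

def adjacent(x, y, in_nbrs):
    """x ~ y iff some vertex i has x_i != y_i while x and y agree on N_-(v_i)."""
    return any(x[i] != y[i] and all(x[j] == y[j] for j in in_nbrs[i])
               for i in range(len(x)))

def hamming(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def matches(in_nbrs, s, rule):
    """Check that adjacency in G(D, s) coincides with rule(Hamming distance)."""
    n = len(in_nbrs)
    configs = list(product(range(s), repeat=n))
    return all(adjacent(x, y, in_nbrs) == rule(hamming(x, y))
               for x in configs for y in configs if x != y)

n, s = 4, 3  # illustrative parameters
acyclic = [list(range(i)) for i in range(n)]                  # transitive tournament
clique = [[j for j in range(n) if j != i] for i in range(n)]  # K_n
cycle = [[(i - 1) % n] for i in range(n)]                     # directed cycle C_n

print(matches(acyclic, s, lambda d: True))      # every distinct pair is adjacent
print(matches(clique, s, lambda d: d == 1))     # Hamming graph H(s, n)
print(matches(cycle, s, lambda d: d <= n - 1))  # distance at most n - 1
```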

Clearly, a set of fixed configurations of some protocol forms an independent set in the guessing graph. 
Theorem 1 below asserts the converse: any independent set can be fixed by some protocol and hence can
be viewed as a set of possible transmitted messages on the original network. 

Theorem 1: A set of configurations in $[s]^n$ are fixed configurations of some protocol for D if and only
if they correspond to an independent set in the graph $G(D, s)$, and hence

$$g(D, s) = \log_s \alpha(G(D, s)).$$

Moreover, a collection of sets of configurations in $[s]^n$ is a set of public messages if and only if it forms a coloring
of the guessing graph $G(D, s)$, and hence

$$b(D, s) = \log_s \chi(G(D, s)).$$

Proof: By definition, any set of fixed configurations of some protocol forms an independent set
in the guessing graph. Conversely, if $\{x^a\}_{a=0}^{k-1}$ is an independent set of the guessing graph, we shall
construct a protocol $\mathcal{P}$ which fixes all the configurations $x^a$. For $0 \le i \le n - 1$, we define the local
functions $\mathcal{P}(x)_{v_i} = f_{v_i}(x_{N_-(v_i)})$ as follows: $f_{v_i}(x^a_{N_-(v_i)}) = x^a_i$ and $f_{v_i}(y) = 0$ if there is no a such
that $y = x^a_{N_-(v_i)}$. Note that this is a non-ambiguous assignment, as either $x^a_{N_-(v_i)} \ne x^b_{N_-(v_i)}$ (and
the assignments are independent) or $x^a_{N_-(v_i)} = x^b_{N_-(v_i)}$ and $x^a_i = x^b_i$ (the same assignment) for all
$a, b \in \{0, 1, \ldots, k - 1\}$.

Finally, since a set of public messages is a partition of $[s]^n$ into sets of fixed configurations, it is
equivalent to a coloring of the guessing graph. ■
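
The construction in the proof of Theorem 1 is effectively a lookup table per vertex. The sketch below illustrates it on the clique $K_3$ over the binary alphabet. It is an illustration under assumptions: all function names are hypothetical, the maximum independent set is found by plain exhaustive search (feasible only for tiny guessing graphs), and unseen inputs default to 0 as in the proof.

```python
from itertools import product

def adjacent(x, y, in_nbrs):
    """x ~ y iff some vertex i has x_i != y_i while x and y agree on N_-(v_i)."""
    return any(x[i] != y[i] and all(x[j] == y[j] for j in in_nbrs[i])
               for i in range(len(x)))

def max_independent_set(configs, in_nbrs):
    """Exhaustive branch-and-bound search; exponential, for tiny examples only."""
    best = []
    def extend(chosen, rest):
        nonlocal best
        if len(chosen) > len(best):
            best = list(chosen)
        for idx, c in enumerate(rest):
            if len(chosen) + len(rest) - idx <= len(best):
                break  # not enough candidates left to beat the current best
            chosen.append(c)
            extend(chosen, [d for d in rest[idx + 1:] if not adjacent(c, d, in_nbrs)])
            chosen.pop()
    extend([], list(configs))
    return best

def protocol_from_independent_set(ind_set, in_nbrs):
    """Local functions f_i as lookup tables keyed by the values seen on N_-(v_i)."""
    tables = [dict() for _ in in_nbrs]
    for x in ind_set:
        for i, nb in enumerate(in_nbrs):
            key = tuple(x[j] for j in sorted(nb))
            # Independence guarantees that two entries with the same key agree.
            assert tables[i].setdefault(key, x[i]) == x[i]
    def protocol(x):
        return tuple(tables[i].get(tuple(x[j] for j in sorted(nb)), 0)
                     for i, nb in enumerate(in_nbrs))
    return protocol

in_nbrs = [[1, 2], [0, 2], [0, 1]]  # K_3
configs = list(product(range(2), repeat=3))
ind = max_independent_set(configs, in_nbrs)
protocol = protocol_from_independent_set(ind, in_nbrs)
fixed = [x for x in configs if protocol(x) == x]
print(len(ind), sorted(fixed) == sorted(ind))
```

Because the fixed configurations of any protocol form an independent set, a protocol built from a maximum independent set fixes exactly that set and nothing more.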

The guessing numbers of the digraphs mentioned in Example 1 were already determined in [11] or
[12]. However, the proof becomes straightforward using Theorem 1.

Example 2: If D is acyclic, then $g(D, s) = 0$ and $b(D, s) = n$ for all s. This can be intuitively
explained as follows: since the digraph has no cycle, no information can circulate around it. Also, the
clique satisfies $g(K_n, s) = n - 1$, $b(K_n, s) = 1$, which means that the $n - 1$ symbols of information
received by any vertex can circulate around the digraph. Finally, for the directed cycle we have $g(C_n, s) =
1$, $b(C_n, s) = n - 1$, since one symbol of information naturally circulates along the cycle.

In order to illustrate the relevance of this result to network coding, we return to the butterfly network
example given in Figure 1. We already showed that it was equivalent to a guessing game on the clique
$K_3$. Its binary guessing graph, given by the cube $H(2, 3)$, is illustrated in Figure 2. Throughout this paper,
we shall represent the configurations in rectangular vertices and shall highlight a maximum independent
set in bold with thick contours.

Fig. 2. The butterfly network as a maximum independent set problem. (a) Circuit representation. (c) Maximum independent set in the guessing graph $G(K_3, 2) = H(2, 3)$.

B. Results based on the guessing graph 

We now investigate the properties of the guessing graph and thus derive bounds on the guessing 
number and on the information defect of digraphs. We first show in Proposition 2 below that the general
guessing number and the general information defect of a digraph are equivalent. From a guessing game
perspective, this shows that the minimum amount of information required to guess everything correctly
($b(D)$) is exactly equal to the amount of information that is not inferred by the players ($n - g(D)$).

Proposition 2: For any digraph D, we have $b(D) + g(D) = n$.

Proof: The bounds on the chromatic number and the independence number of a vertex-transitive
graph in (2) yield $b(D, s) + g(D, s) \ge n$ and for $s \ge 3$

$$b(D, s) \le n - g(D, s) + \log_s(1 + g(D, s) \log s) \le n - g(D, s) + \log_s n + \log_s \log s,$$

which asymptotically yields $b(D) = n - g(D)$. ■
Remark that the equality b(D, s) + g(D, s) = n may not hold for all digraphs and every s (e.g., the 

undirected pentagon over alphabets with s non-square ifTTlO . However, it does hold for every s for the 

digraphs considered in Examples Q] and |2 

The following proposition gives a lower bound on the guessing number based on the degree of the
guessing graph, which shall be refined for large alphabets in Proposition 5.

Proposition 3: For any non-acyclic digraph $D$ with minimum in-degree $\delta$ and any $s$,

g(D, s)>n + log, \ + log, j 1 - ^1 - " ~ | > 5 - log s n. 

Proof: Since the guessing graph is vertex-transitive, its connectivity is at least 2 ( rf + 1 ) by [321] . By 
applying the first inequality in £[]), we easily obtain the first lower bound above. Call this term L; the 
second inequality in © yields L > n-log s (d(G(D,s)) + l). We have d(G{D,s)) = | Ut- Ef » n {°}l» wnere 
\Ei H {0}| = (s — l)s n ~ di ~ 1 as seen in the proof of Proposition [T] and hence d(G(D, s)) < ns n ~ s — 1. 
The second lower bound then follows. ■ 

If $H$ is a spanning subgraph of $D$, then it is easy to verify that $G(H, s) \supseteq G(D, s)$, and hence
$g(H, s) \leq g(D, s)$. Intuitively, $H$ is obtained from $D$ by removing edges, hence less information can
circulate. On the other hand, the guessing graph of any induced subgraph can be viewed as a subgraph
of the guessing graph of $D$. For any induced subgraph $H$ of $D$ and any $e \in [s]^{n - |H|}$, we denote the
subgraph of $G(D, s)$ induced by all configurations satisfying $x_{V - H} = e$ as $G(D, s)_H + e$.

Lemma 1: For any induced subgraph $H$ of $D$ and any $e \in [s]^{n - |H|}$, we have $G(D, s)_H + e \cong G(H, s)$.

Proof: Two configurations $x, y$ are adjacent in $G(D, s)_H + e$ if and only if there exists $v_i \in H$
such that $x_i \neq y_i$ and $x_{N^-(v_i)} = y_{N^-(v_i)}$. Since $x_{V - H} = y_{V - H} = e$, this is equivalent to $x_i \neq y_i$ and
$x_{N^-(v_i) \cap H} = y_{N^-(v_i) \cap H}$, and hence to $x_H$ and $y_H$ being adjacent in $G(H, s)$. ■

Corollary 1: We have $\log_s \omega(G(D, s)) \geq \mathrm{mas}(D)$, where $\mathrm{mas}(D)$ denotes the maximum size of an
acyclic induced subgraph of $D$.

Proof: Let $H$ be a maximum induced acyclic subgraph of $D$, then $G(D, s)_H + e \cong G(H, s)$, which
by Example 1 is a clique on $s^{|H|}$ vertices. ■

The proof of Corollary 1 actually indicates that the family $\{G(D, s)_H + e\}$ for all $e \in [s]^{n - \mathrm{mas}(D)}$
forms a partition of the vertex set of $G(D, s)$ into cliques of size $s^{\mathrm{mas}(D)}$.

Proposition 4 below combines the results derived above with the graph-theoretic results reviewed in
Section III-A.

Proposition 4: For any non-acyclic digraph $D$ and any $s \geq 2$,

Tt 

x —<mas(D)<log s u J (G(D, S )) 

< log s X *(G(A s))=n- log s a(G(D, s))=n- g(D, s) 
<log s x(G(D,s)) = b(D,s) 

< \og s d(G(D, s)) <n-5 + log s n. 
A code with Hamming distance $d$ can be viewed as an independent set of the graph where two words
are adjacent if and only if they differ by at most $d - 1$ coordinates. Therefore, finding a maximum
code with a prescribed minimum distance can be viewed as finding the maximum independent set of
this graph. On the other hand, as seen in Proposition 1, whether two configurations are adjacent in the
guessing graph is completely determined by the coordinates in which they differ. Therefore, determining
the guessing number of a digraph is a similar problem to that of finding error-correcting codes with
maximum cardinality. In particular, Example 1 indicates that the guessing number of the clique $K_n$
(the directed cycle $C_n$, respectively) is equivalent to the maximum cardinality of a code of length $n$
with minimum distance 2 (minimum distance $n$, respectively). Proposition 5 generalizes this property by
viewing a set of fixed configurations as a code, and by bounding its minimum distance.

Proposition 5: If $D$ is a digraph with minimum in-degree $\delta$ and girth $\gamma$, then

$$\log_s A_s(n, n - \delta + 1) \leq g(D, s) \leq \log_s A_s(n, \gamma).$$

In particular, $g(D, s) \geq \delta$ for $s$ the power of a prime and either $s \geq n - 1$ or $s = 2^m$, $n = 2^m + 2$, and
$\delta \in \{4, 2^m\}$ for some $m$.

Proof: First, for any two configurations $x, y \in [s]^n$ adjacent in the guessing graph of $D$, we have
$(x - y)_{N^-(v_i)} = 0$ for some $i$, and hence $d_H(x, y) \leq n - d_i \leq n - \delta$. Therefore, in any code with
minimum distance $n - \delta + 1$, the codewords are not adjacent in the guessing graph, and hence they form
a set of fixed configurations.

Conversely, let $x, y$ be two distinct configurations which are not adjacent in the guessing graph, and
denote $I = \{v_i : x_i \neq y_i\}$ so that $x, y \in G(D, s)_I + x_{V - I}$. Suppose $I$ is acyclic, then $G(I, s)$ is a clique
by Example 1, and by Lemma 1 $G(D, s)_I + x_{V - I}$ is also a clique, and hence $x$ and $y$ are adjacent in
$G(D, s)$. This is a contradiction, thus $I$ contains a cycle and its cardinality is no less than the girth of
$D$. Therefore, the set of fixed configurations of any protocol is a code with minimum distance at least $\gamma$.

Since any code with minimum Hamming distance $n - \delta + 1$ forms a set of fixed configurations, using
an MDS code yields the lower bound $g(D, s) \geq \delta$ for the mentioned parameter values. ■

Proposition 5 implies that for large enough alphabets, the smallest amount $\delta$ of information received
by any vertex can circulate through the network.

C. Combining two graphs 

We now investigate how to combine two digraphs $H_1$ and $H_2$ with disjoint vertex sets. We consider
three different types of digraph union, each leading to a different graph product of their guessing graphs.
We shall illustrate these unions by the following example: $H_1 = K_2$ and $H_2 = P_2$, illustrated in Figure 3.

Fig. 3. The digraphs $K_2$ and $P_2$ and their guessing graphs. Panels include (c) $P_2$ and (d) $G(P_2, 2) = K_4$.

First, the disjoint union of $H_1$ and $H_2$, denoted as $H_1 \cup H_2$, has $V(H_1) \cup V(H_2)$ as vertex set and
$E(H_1) \cup E(H_2)$ as edge set. Its adjacency matrix is hence given by

$$A_{H_1 \cup H_2} = \begin{pmatrix} A_{H_1} & 0 \\ 0 & A_{H_2} \end{pmatrix}.$$

In other words, the digraphs are simply placed next to each other, without adding any edges. For any $D$
with vertex set $V(D) = V(H_1) \cup V(H_2)$, we have $D \supseteq H_1 \cup H_2$ and hence the guessing number of the
disjoint union of $H_1$ and $H_2$ is a lower bound for the guessing number of $D$. In [12, Lemma 3.2], it
is shown that the (linear) guessing number of the disjoint union of two digraphs is equal to the sum of
their (linear) guessing numbers. We give an alternate proof below for the nonlinear case by considering
the guessing graphs.

Proposition 6: For all digraphs $H_1$, $H_2$ with disjoint vertex sets and any $s \geq 2$,

$$G(H_1 \cup H_2, s) \cong G(H_1, s) \oplus G(H_2, s), \qquad (5)$$

where $\oplus$ denotes the co-normal product, and hence $g(H_1 \cup H_2, s) = g(H_1, s) + g(H_2, s)$.

Proof: Let $x$ and $y$ be two configurations on $H_1 \cup H_2$, and denote $x_{H_1} = x^1$, $y_{H_1} = y^1$ (and similarly
for $H_2$). They are adjacent in $G(H_1 \cup H_2, s)$ if and only if there exists $v_i$ in $H_1$ or in $H_2$ such that
$x_i \neq y_i$ and $x_{N^-(v_i)} = y_{N^-(v_i)}$. Since the neighborhood of $v_i$ entirely lies in $H_1$ if $v_i \in H_1$ (and similarly
for $H_2$), this is equivalent to $x^1_i \neq y^1_i$, $x^1_{N^-(v_i)} = y^1_{N^-(v_i)}$ or $x^2_i \neq y^2_i$, $x^2_{N^-(v_i)} = y^2_{N^-(v_i)}$. Therefore, this
is equivalent to $x^1 \sim y^1$ in $G(H_1, s)$ or $x^2 \sim y^2$ in $G(H_2, s)$, which yields (5). Finally, since the
independence number is multiplicative under the co-normal product, (5) gives the guessing number of
the disjoint union. ■



Fig. 4. The disjoint union of $K_2$ and $P_2$ and its guessing graph. Panels: (a) $K_2 \cup P_2$;
(b) $G(K_2 \cup P_2, 2) = H(2, 2) \oplus K_4$.

Example 3: The guessing graph of the disjoint union of $K_2$ and $P_2$ is illustrated in Figure 4
(we represent the configurations in hexadecimal form). Because it is a very dense graph, we only show
which configurations are adjacent to the all-zero configuration. It is clear that $\alpha(G(K_2 \cup P_2, 2)) = 2$ and
hence $g(K_2 \cup P_2, 2) = 1$.

As a corollary of Proposition 6, we now give lower bounds on the guessing number of a digraph by
considering the sum of guessing numbers of its induced subgraphs. We refer to a clique partition as a
partition of the vertex set of a digraph into $r$ subsets such that the graph induced by each subset forms a
clique. The clique partition number of a digraph $D$, denoted as $c(D)$, is the minimum number of subsets
in any clique partition of $D$. Then it is easily shown that $g_{\mathrm{linear}}(D, s) \geq n - c(D)$, which actually refines
the lower bound in [12, Theorem 3.3] for graphs with bidirectional edges.

We strengthen the result on the guessing number of the disjoint union below by considering the
unidirectional union of $H_1$ and $H_2$, denoted as $H_1 \vec{\cup} H_2$, and defined to be $(V(D), E(D))$ with $V(D) =
V(H_1) \cup V(H_2)$ and $E(D) = E(H_1) \cup E(H_2) \cup \{(i, j) : i \in V(H_1), j \in V(H_2)\}$. Its adjacency matrix
is given by

$$A_{H_1 \vec{\cup} H_2} = \begin{pmatrix} A_{H_1} & 1 \\ 0 & A_{H_2} \end{pmatrix},$$

where $1$ and $0$ denote the all-ones and all-zero blocks. In other words, we make all the possible
connections, but only from $H_1$ to $H_2$.



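Proposition 7: For all $H_1$, $H_2$ with disjoint vertex sets and any $s \geq 2$,

$$G(H_1 \vec{\cup} H_2, s) = G(H_1, s) \cdot G(H_2, s),$$

where $\cdot$ is the lexicographic product, and hence $g(H_1 \vec{\cup} H_2, s) = g(H_1, s) + g(H_2, s)$. Also, we have

$$g_{\mathrm{linear}}(H_1 \vec{\cup} H_2, s) = g_{\mathrm{linear}}(H_1, s) + g_{\mathrm{linear}}(H_2, s).$$

Proof: The proof for the guessing number is similar to that of Proposition 6 and is hence omitted.
We hence prove the result for the linear guessing number. For any $A \leq A_{H_1 \vec{\cup} H_2}$, we have

$$I_n + A = \begin{pmatrix} I_{n_1} + A_1 & A_3 \\ 0 & I_{n_2} + A_2 \end{pmatrix},$$

where $A_1 \leq A_{H_1}$ and $A_2 \leq A_{H_2}$. Therefore,

$$\mathrm{rk}(I_n + A) \geq \mathrm{rk}(I_{n_1} + A_1) + \mathrm{rk}(I_{n_2} + A_2) \qquad (6)$$
$$\geq \min_{A_1 \leq A_{H_1}} \mathrm{rk}(I_{n_1} + A_1) + \min_{A_2 \leq A_{H_2}} \mathrm{rk}(I_{n_2} + A_2),$$

and hence $g_{\mathrm{linear}}(H_1 \vec{\cup} H_2, s) \leq g_{\mathrm{linear}}(H_1, s) + g_{\mathrm{linear}}(H_2, s)$, since the linear guessing number of a
digraph $D$ equals $n - \min_{A \leq A_D} \mathrm{rk}(I_n + A)$. Furthermore, if $A_3 = 0$, we have
equality in (6) and hence we can easily prove the reverse inequality. ■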

Example 4: The guessing graph of the unidirectional union of $K_2$ and $P_2$ is illustrated in Figure 5
below. Because it is a very dense graph, we only show which configurations are adjacent to the all-zero
configuration. Although it is distinct from the guessing graph of the disjoint union, they both have the same
independence number.
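The comparison between the disjoint and unidirectional unions of $K_2$ and $P_2$ can be reproduced numerically. The sketch below is only an illustration under assumed conventions (vertices $v_0, v_1$ form $K_2$, vertices $v_2, v_3$ form $P_2$ with the arc $v_2 \to v_3$, and digraphs are encoded by in-neighbour lists); the helper names are hypothetical.

```python
from itertools import product

def guessing_graph(in_nbrs, s):
    """Adjacency sets of G(D, s); in_nbrs[i] lists the in-neighbours of vertex i."""
    configs = list(product(range(s), repeat=len(in_nbrs)))
    adj = [set() for _ in configs]
    for a, x in enumerate(configs):
        for b in range(a + 1, len(configs)):
            y = configs[b]
            # adjacent iff some vertex differs while its in-neighbourhood agrees
            if any(x[i] != y[i] and all(x[u] == y[u] for u in nbrs)
                   for i, nbrs in enumerate(in_nbrs)):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def independence_number(adj):
    """Exact independence number by branching; only meant for tiny graphs."""
    def solve(cands):
        if not cands:
            return 0
        v = max(cands, key=lambda u: len(adj[u] & cands))
        if not adj[v] & cands:
            return len(cands)
        rest = cands - {v}
        return max(1 + solve(rest - adj[v]), solve(rest))
    return solve(set(range(len(adj))))

# Disjoint union: no arcs between K2 = {v0, v1} and P2 = {v2, v3}.
disjoint = [[1], [0], [], [2]]
# Unidirectional union: every vertex of K2 also points to every vertex of P2.
unidirectional = [[1], [0], [0, 1], [0, 1, 2]]

G_disjoint = guessing_graph(disjoint, 2)
G_uni = guessing_graph(unidirectional, 2)
alpha_disjoint = independence_number(G_disjoint)
alpha_uni = independence_number(G_uni)
```

Both graphs are built on the same ordering of the 16 binary configurations, so their adjacency structures can be compared directly.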

Proposition 7 indicates that the edges from $H_1$ to $H_2$ do not increase the guessing number and can
hence be omitted. Intuitively, since these edges only go in one direction, they do not create any more cycles,
and hence no more information can circulate through the whole digraph. If we apply this simplification
recursively, we obtain that the guessing number of a digraph is completely determined by the guessing
numbers of its strong components.

Corollary 2: For any digraph $D$ with strong components $C_i$ for $1 \leq i \leq r$, we have $g(D, s) =
\sum_{i=1}^{r} g(C_i, s)$ and $g_{\mathrm{linear}}(D, s) = \sum_{i=1}^{r} g_{\mathrm{linear}}(C_i, s)$. Therefore, $g(D, s) \leq n - r$.

Proof: The proof goes by induction on the number $r$ of strong components. The case where $r = 1$
is straightforward. Let us assume the result is true for all digraphs with at most $r - 1$ components and
consider $D$ with $r$ components. It is well-known that if each component is contracted to a single vertex,
the resulting digraph, referred to as the condensation of $D$, is acyclic. In this condensation, there exists a
vertex with in-degree 0 (without loss, corresponding to the component $C_1$) such that $D = C_1 \vec{\cup} H$, where
$H$ is the subgraph induced by $V(D) - V(C_1)$. We then have $g(D, s) = g(C_1, s) + g(H, s)$; however,
since $H$ has $r - 1$ components $C_2, \ldots, C_r$, we obtain $g(D, s) = g(C_1, s) + g(C_2, s) + \ldots + g(C_r, s)$. The
proof is similar for the linear case. Finally, since $g(C_i, s) \leq |C_i| - 1$ for all $i$, we have $g(D, s) \leq n - r$. ■

Finally, the bidirectional union of two digraphs, denoted as $H_1 \bar{\cup} H_2$, is obtained by connecting all
vertices of $H_1$ to those of $H_2$, and vice versa. We have $E(H_1 \bar{\cup} H_2) = E(H_1) \cup E(H_2) \cup \{(i_1, i_2), (i_2, i_1) :
i_1 \in V(H_1), i_2 \in V(H_2)\}$. Its adjacency matrix is given by

$$A_{H_1 \bar{\cup} H_2} = \begin{pmatrix} A_{H_1} & 1 \\ 1 & A_{H_2} \end{pmatrix},$$

where $1$ denotes the all-ones block. Clearly, for any digraph $D$ and any two induced subgraphs $H_1$ and $H_2$ of $D$ with disjoint vertex sets,
we have $D \subseteq H_1 \bar{\cup} H_2$; therefore, the guessing number of the bidirectional union is an upper bound on
the guessing number of any union of $H_1$ and $H_2$.

Proposition 8: For any $H_1$, $H_2$ with disjoint vertex sets and any $s \geq 2$,

$$G(H_1 \bar{\cup} H_2, s) \cong G(H_1, s) \,\square\, G(H_2, s),$$

where $\square$ denotes the cartesian product. Therefore,

$$b(H_1 \bar{\cup} H_2, s) = \max\{b(H_1, s), b(H_2, s)\}, \qquad (7)$$
$$g(H_1 \bar{\cup} H_2, s) \leq \min\{g(H_1, s) + n_2, g(H_2, s) + n_1\}.$$

In the linear case, we have $g_{\mathrm{linear}}(H_1 \bar{\cup} H_2, s) = \min\{g_{\mathrm{linear}}(H_1, s) + n_2, g_{\mathrm{linear}}(H_2, s) + n_1\}$.

Fig. 6. The bidirectional union of $K_2$ and $P_2$ and its guessing graph. Panels: (a) $K_2 \bar{\cup} P_2$;
(b) $G(K_2 \bar{\cup} P_2, 2) = H(2, 2) \,\square\, K_4$.

Proof: The proof for the general case is similar to that of Proposition 6 and hence omitted. We now
prove the linear case. Let $A \leq A_{H_1 \bar{\cup} H_2}$ such that $\mathrm{rk}(I_n + A) = n - g_{\mathrm{linear}}(H_1 \bar{\cup} H_2, s)$. Since the
diagonal blocks of $I_n + A$ are $I_{n_1} + A_1$ and $I_{n_2} + A_2$
for some $A_1 \leq A_{H_1}$ and $A_2 \leq A_{H_2}$, we have $\mathrm{rk}(I_n + A) \geq \max\{\mathrm{rk}(I_{n_1} + A_1), \mathrm{rk}(I_{n_2} + A_2)\} \geq
\max\{n_1 - g_{\mathrm{linear}}(H_1, s), n_2 - g_{\mathrm{linear}}(H_2, s)\}$.

Conversely, without loss suppose $l = n_1 - g_{\mathrm{linear}}(H_1, s) \geq n_2 - g_{\mathrm{linear}}(H_2, s)$ and let $A_1$ and $A_2$ satisfy
$\mathrm{rk}(A_i) = n_i - g_{\mathrm{linear}}(H_i, s)$ for $i = 1, 2$. We can express $A_i$ as $A_i = B_i^T C_i$, where $B_i, C_i \in GF(s)^{l \times n_i}$.
Then the matrix $A = (B_1, B_2)^T (C_1, C_2)$ has rank $l$. ■

Example 5: The guessing graph of the bidirectional union of $K_2$ and $P_2$ is depicted in Figure 6.
In this case, we have $g(K_2 \bar{\cup} P_2, 2) = g(P_2, 2) + 2$ because the optimal protocols are linear.

Example 6: Consider the following network coding instance, where $n$ sources want to transmit a
message each via a common bottleneck of $m \leq n$ nodes (depicted in Figure 7 for $n = 3$, $m = 2$). The
network coding instance is solvable if and only if the complete bipartite graph $K_{m,n}$ has guessing number $n$.
Since this digraph can be viewed as the bidirectional union of the empty graphs on $n$ and $m$ vertices,
its guessing number is upper bounded by $m$ by Proposition 8. Conversely, since it contains $m$ disjoint
cliques $K_2$, its guessing number is lower bounded by $m$. Therefore, the network coding instance in Figure
7 is solvable if and only if $m = n$, i.e., there is no bottleneck and routing is sufficient.

Fig. 7. The bottleneck with $n = 3$, $m = 2$. Panels: (a) network coding (circuit representation);
(b) guessing game.
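The bottleneck argument can be checked by brute force on the smallest cases. The sketch below is illustrative only: it encodes the complete bipartite graph by in-neighbour lists (an assumed encoding, with the first $m$ vertices on the bottleneck side) and tests whether the binary guessing graph has an independent set of size $2^n$, which is the solvability criterion stated in the example. All function names are hypothetical.

```python
from itertools import product

def guessing_graph(in_nbrs, s):
    """Adjacency sets of G(D, s); in_nbrs[i] lists the in-neighbours of vertex i."""
    configs = list(product(range(s), repeat=len(in_nbrs)))
    adj = [set() for _ in configs]
    for a, x in enumerate(configs):
        for b in range(a + 1, len(configs)):
            y = configs[b]
            if any(x[i] != y[i] and all(x[u] == y[u] for u in nbrs)
                   for i, nbrs in enumerate(in_nbrs)):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def independence_number(adj):
    """Exact independence number by branching; only meant for tiny graphs."""
    def solve(cands):
        if not cands:
            return 0
        v = max(cands, key=lambda u: len(adj[u] & cands))
        if not adj[v] & cands:
            return len(cands)
        rest = cands - {v}
        return max(1 + solve(rest - adj[v]), solve(rest))
    return solve(set(range(len(adj))))

def complete_bipartite(m, n):
    """In-neighbour lists of K_{m,n}; vertices 0..m-1 form the bottleneck side."""
    left, right = list(range(m)), list(range(m, m + n))
    return [right] * m + [left] * n

def bottleneck_alpha(m, n, s=2):
    return independence_number(guessing_graph(complete_bipartite(m, n), s))

def bottleneck_solvable(m, n, s=2):
    # Solvable iff the guessing number equals n, i.e. alpha = s^n.
    return bottleneck_alpha(m, n, s) == s ** n
```

For $n = 3$, $m = 2$ the independence number is $2^m = 4 < 2^n$, in line with the conclusion that the instance of Figure 7 is not solvable.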

D. Combining alphabets

A network coding instance solvable over $[s]$ is clearly solvable over $[s^k]$ for any $k \geq 2$. However, it is
shown in [33] that certain network coding instances can be solvable over an alphabet but not over some
larger alphabet. In this section, we discover interesting properties of the guessing graphs of the same
digraph over different alphabets, which yield bounds on and relations amongst the guessing numbers of
a digraph over different alphabets. First, a set of fixed configurations of a protocol on $D$ over $[s]$ can
also be viewed as fixed configurations of a protocol over the alphabet $[t]$, for any $t \geq s$, which yields

$$g(D, t) \geq g(D, s) \log_t s. \qquad (8)$$

We refine this bound below by showing that the guessing graph on the cartesian product of two
alphabets is closely related to the guessing graphs on the two initial alphabets.

Proposition 9: For any digraph $D$ and any $s, t \geq 2$ we have

$$G(D, s) \,\square\, G(D, t) \subseteq G(D, st) \subseteq G(D, s) \oplus G(D, t) \qquad (9)$$

and hence

$$\frac{g(D, s) \log s + g(D, t) \log t}{\log s + \log t} \leq g(D, st) \leq \min\left\{ \frac{g(D, s) \log s + n \log t}{\log s + \log t}, \frac{g(D, t) \log t + n \log s}{\log s + \log t} \right\}.$$
Proof: Since the sets $[st]$ and $[s] \times [t]$ are isomorphic, we consider two configurations $(x^s, x^t), (y^s, y^t) \in
([s] \times [t])^n$. Suppose they are adjacent in $G(D, st)$; therefore there exists $i$ such that $(x^s_i, x^t_i) \neq (y^s_i, y^t_i)$ and
$(x^s_{N^-(v_i)}, x^t_{N^-(v_i)}) = (y^s_{N^-(v_i)}, y^t_{N^-(v_i)})$. This is equivalent to $x^s_{N^-(v_i)} = y^s_{N^-(v_i)}$ and $x^t_{N^-(v_i)} = y^t_{N^-(v_i)}$
and ($x^s_i \neq y^s_i$ or $x^t_i \neq y^t_i$). It is easy to check that they are adjacent in $G(D, s) \oplus G(D, t)$. Moreover, we
can similarly prove the other inclusion. ■

As a corollary, we obtain that the guessing number over any alphabet can serve as a lower bound for
the guessing numbers over larger alphabets.

Corollary 3: For any $t \geq s$ with $m = \lfloor \log_s t \rfloor$, we have

$$g(D, s) \frac{m}{\log_s t} \leq g(D, t) \leq \frac{g(D, s) + mn}{\log_s t}.$$

Proof: By applying Proposition 9 recursively, we obtain $g(D, s^{m+1}) \leq \frac{g(D, s) + mn}{m + 1}$, and the upper
bound follows from (8). Also, applying (9) recursively yields $g(D, s^l) \geq g(D, s)$ for all $l \geq 1$, which
combined with (8) yields the lower bound. ■
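To make the dependence on the alphabet size concrete, the bounds of Corollary 3 can be evaluated numerically. The following sketch is an illustration only; the function name and argument order are assumptions, and the floor of $\log_s t$ is computed with integer arithmetic to avoid floating-point rounding at exact powers.

```python
import math

def corollary3_bounds(g_s, n, s, t):
    """Lower and upper bounds on g(D, t) given g(D, s), for alphabets t >= s >= 2."""
    if t < s or s < 2:
        raise ValueError("requires t >= s >= 2")
    m = 0
    while s ** (m + 1) <= t:        # m = floor(log_s t), computed exactly
        m += 1
    log_s_t = math.log(t) / math.log(s)
    lower = g_s * m / log_s_t
    upper = (g_s + m * n) / log_s_t
    return lower, upper

# Directed cycle C_5: g(C_5, 2) = 1 on n = 5 vertices, extended to the alphabet [4].
low, high = corollary3_bounds(1, 5, 2, 4)
```

When $t$ is an exact power of $s$ the lower bound reduces to $g(D, s)$, which is the case evaluated above.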

The result in (9) can be interpreted using digraph unions. For any digraph $D$ and any $k \geq 1$, we denote
the digraph $k \oplus D$, whose vertex set is given by $V(k \oplus D) = \{(v, i) : v \in V(D), i \in [k]\}$ and
whose edge set is $E(k \oplus D) = \{((u, i), (v, j)) : (u, v) \in E(D)\}$. In other words, we take $k$ copies of $D$ and
make connections between the copies corresponding to the edges in $D$. Therefore, the in-neighborhood
of a vertex $(v, i)$ in $k \oplus D$ consists of the $k$ copies of the in-neighborhood of $v$. In terms of network
coding, the digraph $k \oplus D$ can be viewed as expanding the instance according to the $k$ symbols in $[s]$ of
an element of $[s^k]$.

Proposition 10: For any $D$, $k$, and $s$, we have $G(k \oplus D, s) = G(D, s^k)$ and hence $g(k \oplus D, s) =
k\, g(D, s^k)$.

The proof is similar to that of Proposition 6 and is hence omitted. Note that for $k = 2$ and $D_1 = D_2 =
D$, we have $D_1 \cup D_2 \subseteq 2 \oplus D \subseteq D_1 \bar{\cup} D_2$; hence (9) can be viewed as an extension of Proposition 10 to
mixed alphabets. Proposition 10 means that playing the guessing game over extension fields is equivalent
to playing the guessing game over the base field, but on several copies of the digraph.

The result in Proposition 10 also implies that $2 \oplus D$ is the union of two copies of $D$ which, like
the unidirectional union of Proposition 7, does not improve on the general guessing number of the
disjoint union. As seen before, the unidirectional union did not add any cycles to the digraph, hence the
information could not circulate between the two copies of the digraph. On the other hand, the union $2 \oplus D$
does create new cycles, yet the information received by any vertex is redundant as the in-neighborhood
of any vertex in $2 \oplus D$ is simply two copies of its in-neighborhood in $D$. For instance, the digraph $2 \oplus C_3$
illustrated in Figure 8 has guessing number 2 over any alphabet.

Fig. 8. The digraph $2 \oplus C_3$ with guessing number 2.
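The statement about two copies of the directed cycle can be checked in the binary case. The sketch below is only illustrative: it builds the $k$-copy digraph from in-neighbour lists (vertex $(v, i)$ is stored at index $vk + i$, an assumed encoding) and compares the independence number of its binary guessing graph with that of the guessing graph of $C_3$ over the alphabet of size 4, as Proposition 10 suggests. The helper names are hypothetical.

```python
from itertools import product

def guessing_graph(in_nbrs, s):
    """Adjacency sets of G(D, s); in_nbrs[i] lists the in-neighbours of vertex i."""
    configs = list(product(range(s), repeat=len(in_nbrs)))
    adj = [set() for _ in configs]
    for a, x in enumerate(configs):
        for b in range(a + 1, len(configs)):
            y = configs[b]
            if any(x[i] != y[i] and all(x[u] == y[u] for u in nbrs)
                   for i, nbrs in enumerate(in_nbrs)):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def independence_number(adj):
    """Exact independence number by branching; only meant for tiny graphs."""
    def solve(cands):
        if not cands:
            return 0
        v = max(cands, key=lambda u: len(adj[u] & cands))
        if not adj[v] & cands:
            return len(cands)
        rest = cands - {v}
        return max(1 + solve(rest - adj[v]), solve(rest))
    return solve(set(range(len(adj))))

def k_copies(in_nbrs, k):
    """In-neighbour lists of the k-copy digraph: (v, i) sees every copy (u, j) of each u in N^-(v)."""
    result = []
    for nbrs in in_nbrs:
        expanded = [u * k + j for u in nbrs for j in range(k)]
        result.extend([expanded] * k)   # the k copies of v share the same in-neighbourhood
    return result

C3 = [[2], [0], [1]]                                       # directed 3-cycle
alpha_two_copies = independence_number(guessing_graph(k_copies(C3, 2), 2))
alpha_quaternary = independence_number(guessing_graph(C3, 4))
```

Both guessing graphs have 64 vertices; an independence number of 4 corresponds to a binary guessing number of 2 for the two-copy digraph.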



IV. A construction of digraphs based on cyclic codes

In this section, for the sake of simplicity we only consider the binary guessing number (i.e., $s = 2$).
However, the concepts introduced below can be easily extended to any field.

A. Digraphs generated by cyclic codes

We first define a simple linear protocol which takes advantage of all the information incoming at every
node.

Definition 2: The parity-check protocol $\mathcal{H}$ has the functions $\mathcal{H}(x)_v$ defined for any $v \in V$ as
$f_v(x_{N^-(v)}) = \mathbf{1} \cdot x_{N^-(v)}$, or equivalently $f_v(x_{N^-(v)}) = \sum_{u \in N^-(v)} x_u$.

By definition, the parity-check protocol is linear, hence its fixed configurations form a linear binary
code. It is easily shown that it has an extended parity-check matrix given by $H' = I_n + A_D^T$. Clearly,
the rows of $H'$ may be linearly dependent, as seen in Example 7 below. Therefore, our aim is to use
extended parity-check matrices with low rank.

Example 7: Let $C_3$ be the directed cycle on three edges with adjacency matrix

$$A_D = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}.$$

The resulting matrix $H'$ is given by

$$H' = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix},$$

which has rank 2. Therefore, the fixed configurations of the parity-check protocol form a (3, 1) binary
code (the repetition code) whose generator matrix is given by

$$G = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}.$$



Any linear protocol on a digraph D can be viewed as the parity-check protocol on a subgraph of D. 
Therefore, the linear guessing number of D is given by the logarithm of the maximum number of fixed 
configurations of the parity-check protocol over all subgraphs of D. In other words, we do not lose any 
generality by considering the parity-check protocol only instead of any linear protocol. The maximum 
linear guessing number over all digraphs with no bidirectional edges is hence given by the logarithm 
of the maximum number of fixed configurations of the parity-check protocol of all digraphs with no 
bidirectional edges. 
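As a small illustration of the parity-check protocol, the sketch below builds the extended parity-check matrix $I_n + A_D^T$ from in-neighbour lists, computes its rank over GF(2), and enumerates the fixed configurations for the directed cycle on three vertices. The encoding of the digraph and the function names are assumptions made for this example, not notation from the text.

```python
from itertools import product

def parity_check_matrix(in_nbrs):
    """Rows of H' = I_n + A_D^T: row v has ones at v and at its in-neighbours."""
    n = len(in_nbrs)
    return [[1 if u == v or u in in_nbrs[v] else 0 for u in range(n)]
            for v in range(n)]

def gf2_rank(rows):
    """Rank over GF(2) of a 0/1 matrix, using rows packed into integers."""
    basis = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while basis:
        pivot = basis.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot                    # lowest set bit of the pivot row
            basis = [r ^ pivot if r & low else r for r in basis]
    return rank

def fixed_configurations(in_nbrs):
    """Binary configurations in which every vertex equals the parity of its in-neighbours."""
    n = len(in_nbrs)
    return [x for x in product((0, 1), repeat=n)
            if all(x[v] == sum(x[u] for u in in_nbrs[v]) % 2 for v in range(n))]

C3 = [[2], [0], [1]]              # directed cycle v0 -> v1 -> v2 -> v0
H = parity_check_matrix(C3)
rank = gf2_rank(H)
fixed = fixed_configurations(C3)
```

The number of fixed configurations is $2^{n - \mathrm{rk}(H')}$, so lowering the rank of the extended parity-check matrix directly increases the number of fixed configurations.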

We now reverse the problem, and construct digraphs based on linear codes. Clearly, any collection of
vectors $c_0, c_1, \ldots, c_{n-1} \in GF(2)^n$ where the $i$-th coordinate of $c_i$ is equal to 1 would produce a matrix
of the type $I + A_D$ for some digraph $D$, and the code would simply be the dual of the span of these
vectors. Since the properties of the obtained digraph are not easy to determine in general, we focus on
the class of cyclic codes.

Definition 3: Let $C$ be an $(n, k)$ binary cyclic code generated by the polynomial $g(x)$. Then the digraph
generated by $C$ has adjacency matrix $I_n + H'^T$, where the rows of $H'$ are the $n$ cyclic shifts of $g(x)$.
Equivalently, denoting $g(x) = \sum_i g_i x^i$, there is an edge from $v_{a+i \bmod n}$ to $v_a$ if and only if $g_i = 1$
for all $a$ and $i$.

Example 8: Three trivial polynomials generate the following digraphs.

• The polynomial $g(x) = 1$ generates the empty graph;

• $g(x) = x + 1$ generates the directed cycle $C_n$ (in particular, $C_3$ given in Example 7 is generated by
the (3, 2) single parity-check code);

• $g(x) = \frac{x^n + 1}{x + 1} = x^{n-1} + x^{n-2} + \ldots + 1$ generates the clique $K_n$.

The generation of the clique can be generalized when $n = st$ is a composite number. Then we have
$x^{st} + 1 = (x^s + 1)(x^{(t-1)s} + x^{(t-2)s} + \ldots + x^s + 1)$, hence the rightmost polynomial generates an
$(st, s)$ cyclic code, which generates the disjoint union of $s$ cliques of size $t$ each. According to
our previous results, this digraph has in-degree and out-degree equal to $t - 1$, while its linear guessing
number is $s(t - 1)$. This digraph is not connected; however, by adding a cycle $C_n$ that connects all the
vertices, we make the digraph strong, while increasing the in-degree by 1. We thus obtain a class of
strong regular digraphs on $n$ vertices and in-degree $d$ satisfying $g_{\mathrm{linear}}(D, 2) \geq n - n/d$ for all values of $d$.
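The correspondence between generator polynomials and digraphs can be made concrete with a few lines of code. The sketch below is an illustration under stated assumptions: a polynomial is given as its coefficient list $[g_0, g_1, \ldots]$, digraphs are returned as in-neighbour lists, and the coefficient $g_0$ is skipped because it only contributes the diagonal of $I_n + H'^T$. The function name is hypothetical.

```python
def digraph_from_polynomial(coeffs, n):
    """In-neighbour lists: edge from v_{(a+i) mod n} to v_a whenever g_i = 1, for i >= 1."""
    return [sorted({(a + i) % n for i, gi in enumerate(coeffs) if gi and i >= 1})
            for a in range(n)]

cycle_5 = digraph_from_polynomial([1, 1], 5)                # g(x) = x + 1
clique_4 = digraph_from_polynomial([1, 1, 1, 1], 4)         # g(x) = x^3 + x^2 + x + 1
two_triangles = digraph_from_polynomial([1, 0, 1, 0, 1], 6)  # g(x) = x^4 + x^2 + 1, n = 2 * 3
```

In the last case the vertices with even index and those with odd index each form a clique on three vertices, which is the disjoint union described for $n = st$ with $s = 2$ and $t = 3$.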

The properties of digraphs generated by cyclic codes are listed in Theorem 2 below.

Theorem 2: The digraph $D$ on $n$ vertices generated by $C$ with generator polynomial $g(x) = \sum_i g_i x^i$
(hence $g(x)$ divides $x^n + 1$) has the following properties.

1) $D$ is regular with in-degree and out-degree $w(g) - 1$, where $w(g)$ is the number of non-zero
coefficients of $g(x)$.

2) $D$ has no bidirectional edges if and only if $g_i g_{n-i} = 0$ for all $1 \leq i \leq \lfloor n/2 \rfloor$. In particular, if
$\deg(g) < n/2$, then $D$ has no bidirectional edges.

3) $D$ is a tournament if and only if $g_i + g_{n-i} = 1$ for all $1 \leq i \leq n - 1$.

4) If $g_i g_j = 1$ for some $i, j \in \{1, 2, \ldots, n\}$ relatively prime, then $D$ is strong.

5) The first $n - \deg(g)$ vertices induce a maximum acyclic subgraph.

6) The binary (linear) guessing number of $D$ satisfies $g_{\mathrm{linear}}(D, 2) = g(D, 2) = \deg(g)$.

Proof: The matrix obtained by shifting $g(x)$ $n$ times has the following properties. First, $g(x)$ divides
$x^n + 1$ hence $g_0 = 1$ and that matrix has ones all over the diagonal, which ensures that it is the adjacency
matrix of some digraph $D$. Second, every row and every column has exactly $w(g)$ ones, which yields
Property 1). Properties 2) and 3) are easy to prove.

Third, if $g_i g_j = 1$ for some $i, j$ relatively prime, then we have $ai + bj = 1$ for some $a, b \in \mathbb{Z}$, and
hence $a'i + b'j = 1 \bmod n$ for $0 \leq a', b' < n$. Therefore, there is a path of length $a' + b'$ from the node
$v_e$ to the node $v_{e+1 \bmod n}$ for all $0 \leq e \leq n - 1$. By iteration, there is a path between $v_e$ and $v_f$ for all
$0 \leq e, f \leq n - 1$ and $D$ is strong.

Finally, we prove the last two properties simultaneously. It is easy to check that the first $n - \deg(g)$
vertices induce an acyclic subgraph in reverse topological order. The dimension of a cyclic code is equal
to $n - \deg(g)$, and hence the dimension of its dual is equal to $\deg(g)$ and $g(D, 2) \geq g_{\mathrm{linear}}(D, 2) \geq \deg(g)$.
On the other hand, $g(D, 2) \leq n - \mathrm{mas}(D) \leq \deg(g)$ by Proposition 4, implying equalities everywhere. ■

Properties 5) and 6) naturally imply constructions of solvable network coding instances based on cyclic
codes, where the first $n - \deg(g)$ vertices of the digraph generated by $C$ are the intermediate nodes,
while the remaining $\deg(g)$ vertices are split into sources and sinks. These instances are solvable over
GF(2) using the parity-check protocol, and are hence solvable over any alphabet with cardinality equal
to a power of 2.

Theorem 2 indicates that a good choice for $g(x)$ has high degree but low weight. We give an example
of such a polynomial below.

Example 9: Let $n = 7$ and consider the digraph $P_7$ generated by $g(x) = x^4 + x^2 + x + 1$ and illustrated
in Figure 9. By Theorem 2, this is a strong and regular tournament, sometimes referred to as a Paley
tournament. Its binary linear guessing number is $\deg(g) = 4$, and the fixed configurations form the (7, 4)
Hamming code.

Fig. 9. Digraph $P_7$ on 7 vertices generated by $x^4 + x^2 + x + 1$ with binary linear guessing number 4.
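The properties claimed for $P_7$ are easy to verify by enumeration. The following sketch is illustrative only: it builds the in-neighbourhoods from the coefficients of $g(x) = 1 + x + x^2 + x^4$ following Definition 3, checks regularity and the tournament property, and lists the binary vectors orthogonal to all cyclic shifts of $g(x)$. Variable names are assumptions made for this example.

```python
from itertools import product

n, g = 7, [1, 1, 1, 0, 1]                  # g(x) = 1 + x + x^2 + x^4
support = [i for i, gi in enumerate(g) if gi]

# In-neighbours of v_a: v_{(a+i) mod n} for every i >= 1 with g_i = 1 (Definition 3).
in_nbrs = [{(a + i) % n for i in support if i} for a in range(n)]

# Regular: every vertex has w(g) - 1 in-neighbours.
is_regular = all(len(nb) == len(support) - 1 for nb in in_nbrs)

# Tournament: exactly one arc between any two distinct vertices.
is_tournament = all((u in in_nbrs[v]) != (v in in_nbrs[u])
                    for u in range(n) for v in range(u + 1, n))

# Rows of H' are the n cyclic shifts of g(x); fixed configurations satisfy H' x = 0 over GF(2).
rows = [[(a + i) % n for i in support] for a in range(n)]
fixed = [x for x in product((0, 1), repeat=n)
         if all(sum(x[j] for j in row) % 2 == 0 for row in rows)]

num_fixed = len(fixed)                                   # 2^(guessing number)
min_weight = min(sum(x) for x in fixed if any(x))        # minimum distance of the code
```

Sixteen fixed configurations with minimum weight 3 are the parameters of the (7, 4) Hamming code, matching the binary linear guessing number $\deg(g) = 4$.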

This construction illustrates the elegance of the guessing game approach to network coding. Indeed, 
the source-intermediate node-sink hierarchy in the network coding instance vanishes and all nodes are 
on the same level, hence yielding more symmetry in the resulting digraph. 

More generally, the generator polynomial of the $(2^l - 1, l)$ simplex code generates a digraph on
$n_l = 2^l - 1$ vertices, regular with in-degree $d_l = 2^{l-1} - 1$, maximum induced acyclic subgraph of size $m_l = l$, and
binary linear guessing number $g_l = 2^l - l - 1$. Although these digraphs may have bidirectional edges, the
corresponding network coding instances do not. Therefore, we obtain solvable network coding instances
where the in-degree is around half the number of vertices, and for which the number of intermediate
nodes grows as the logarithm of the number of source-sink pairs.
B. Digraphs with no bidirectional edges generated by cyclic codes

So far, we allowed digraphs to have bidirectional edges, which made the search for digraphs with high
linear guessing numbers quite easy. We are now interested in digraphs with no bidirectional edges. Based
on Theorem 2, this is equivalent to searching for polynomials $g(x)$ dividing $x^n + 1$ such that $g_i g_{n-i} = 0$
for all $1 \leq i \leq \lfloor n/2 \rfloor$.

We first give a simple example of such a polynomial. Let $n = 3t$ be a multiple of 3 with $t > 3$,
$\gcd(t, 3) = 1$, then $x^3 + 1$ and $x^t + 1$ divide $x^n + 1$. In particular, their least common multiple, given by
$(x^t + 1)(x^2 + x + 1) = x^{t+2} + x^{t+1} + x^t + x^2 + x + 1$, is a valid polynomial with degree $t + 2$ and weight 6.
Therefore, according to Theorem 2, the digraph generated by this polynomial has in-degree and out-degree 5
and its linear guessing number is $\frac{n}{3} + 2$. Moreover, Theorem 2 ensures that this digraph has no
bidirectional edges and is strong.
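The divisibility and the absence of bidirectional edges can be checked mechanically for a given $t$. In the sketch below, which is an illustration rather than part of the construction, binary polynomials are stored as Python integers (bit $i$ holds the coefficient of $x^i$); this representation and the function names are assumptions made for the example. The case $t = 5$, $n = 15$ is evaluated.

```python
def gf2_mul(a, b):
    """Product of two binary polynomials stored as integers (bit i = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def gf2_mod(a, b):
    """Remainder of a divided by b over GF(2); b must be non-zero."""
    deg_b = b.bit_length()
    while a.bit_length() >= deg_b:
        a ^= b << (a.bit_length() - deg_b)
    return a

def no_bidirectional_edges(g, n):
    """Condition 2) of Theorem 2: g_i * g_{n-i} = 0 for all 1 <= i <= floor(n/2)."""
    return all(not (((g >> i) & 1) and ((g >> (n - i)) & 1))
               for i in range(1, n // 2 + 1))

t = 5
n = 3 * t
g = gf2_mul((1 << t) | 1, 0b111)           # (x^t + 1)(x^2 + x + 1)

divides = gf2_mod((1 << n) | 1, g) == 0    # does g(x) divide x^n + 1?
degree = g.bit_length() - 1
weight = bin(g).count("1")
unidirectional = no_bidirectional_edges(g, n)
```

Here the degree is $t + 2 = 7$, so the digraph on 15 vertices has in-degree 5 and linear guessing number 7 by Theorem 2.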

This example is interesting because it designs a class of digraphs with no bidirectional edges for which
we know the linear guessing number is strictly greater than $\frac{n}{3}$. On the other hand, the lower bound in
[12, Theorem 3.3] is given by the cycle packing index of the digraph, which can be easily shown to be
upper bounded by $\frac{n}{3}$; therefore, that bound is not tight for these digraphs.

If $n = 2p$ is even, then $x^{p-1} + x^{p-2} + \ldots + 1$ is a valid polynomial, which generates a strong
unidirectional digraph with in-degree $p - 1$ and whose linear guessing number is equal to $p - 1$.

Let $g(x)$ be a factor of $x^{t-1} + x^{t-2} + \ldots + 1 = \frac{x^t + 1}{x + 1}$ with degree $d$ and weight $w$. Then for all $l \geq 1$,
$x^{2^l t} + 1 = (x^t + 1)^{2^l}$ has $h(x) = (x + 1) g^{2^l}(x)$ as factor. The degree of $h(x)$ is clearly $2^l d + 1$, while
the weight of $h(x)$ is $2w$, and we have $h_1 = 1$. Therefore, this constructs an infinite class of strong
unidirectional digraphs with $2^l t$ vertices, in-degree $2w - 1$, and binary guessing number $2^l d + 1$.

Our approach was restricted to polynomials $g(x)$ which generate a cyclic code, or equivalently, which
divide $x^n + 1$. However, any polynomial $h(x)$ where $h_0 = 1$, $h_i h_{n-i} = 0$ for all $i$, and $h_p = 1$ for $p$
relatively prime to $n$ generates a regular strong digraph with no bidirectional edges. The polynomial $h(x)$
belongs to the code generated by the greatest common divisor of $h(x)$ and $x^n + 1$, therefore the guessing
number of the digraph generated by $h(x)$ is lower bounded by $\deg(\gcd(h(x), x^n + 1))$.

Example 10: Let $n = 14$ and $h(x) = x^{12} + x^{11} + x^{10} + x^9 + x^6 + x + 1$, then

$$\gcd(h(x), x^{14} + 1) = x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1.$$

In this case, the polynomial has a lower weight than its gcd, and hence sparser digraphs can be generated
by considering all polynomials instead of the generator polynomials of cyclic codes only. Nonetheless,
considering such general digraphs is not suitable for constructing network coding instances, as the size
of a maximum induced acyclic subgraph in the digraph generated by $h(x)$ is not easily computable: it is at least
$n - \deg(h) = 2$; however, it is actually equal to 3. Note that Theorem 2 does not apply to $h(x)$, as it
does not divide $x^n + 1$, and the guessing number is strictly less than the degree of $h(x)$.
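The greatest common divisor in this example can be recomputed with the Euclidean algorithm over GF(2). The sketch below is illustrative; it stores binary polynomials as Python integers (bit $i$ holds the coefficient of $x^i$), which is an assumed representation, and the helper names are hypothetical.

```python
def gf2_mod(a, b):
    """Remainder of a divided by b over GF(2); polynomials are integers, b non-zero."""
    deg_b = b.bit_length()
    while a.bit_length() >= deg_b:
        a ^= b << (a.bit_length() - deg_b)
    return a

def gf2_gcd(a, b):
    """Euclidean algorithm for binary polynomials."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def poly(*exponents):
    """Binary polynomial with the given exponents, e.g. poly(3, 1, 0) = x^3 + x + 1."""
    value = 0
    for e in exponents:
        value |= 1 << e
    return value

n = 14
h = poly(12, 11, 10, 9, 6, 1, 0)
x_n_plus_1 = poly(n, 0)

common = gf2_gcd(h, x_n_plus_1)
h_divides = gf2_mod(x_n_plus_1, h) == 0     # False: h(x) does not generate a cyclic code
lower_bound = common.bit_length() - 1       # deg(gcd) bounds the guessing number from below
```

The degree of the gcd, 9, is the lower bound on the guessing number stated before the example, while $\deg(h) = 12$ is not attained because $h(x)$ does not divide $x^{14} + 1$.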

V. On the maximum guessing number of digraphs 

As seen above, constructing digraphs with high guessing numbers is relatively easy when we allow 
bidirectional edges. The main purpose of this section is to evaluate the maximum guessing number one 
obtains when considering strong digraphs with no bidirectional edges. We are particularly interested in 
the binary linear guessing number of sparse digraphs, which will surprisingly turn out to be sufficient. 
However, for the sake of completeness, we shall state our results as generally as possible, as some ideas 
extend to digraphs with bidirectional edges as well. 

A. Upper bounds on the guessing number 

We begin this section by deriving upper bounds on the (linear) guessing number of digraphs based on
their parameters, such as the minimum or maximum in-degree. We first remark in Lemma 2 that the gap
between the guessing number of digraphs and the number of their vertices must grow arbitrarily large.
This implies that the probability of success in the guessing game on a digraph with no bidirectional
edges tends to zero when the number of players tends to infinity. This also indicates that in any family
of solvable network coding instances without any two-hop path between a source and its corresponding sink,
the number of intermediate nodes must tend to infinity.

Lemma 2: For any digraph $D$ with no bidirectional edges and any $s \geq 2$, we have $g(D, s) \leq n -
\log_s((s - 1)n + 1)$.

Proof: Since $D$ has no bidirectional edges, its girth is at least 3. By Proposition 5, we have $g(D, s) \leq
\log_s A_s(n, \gamma) \leq \log_s A_s(n, 3)$. Applying the sphere-packing bound $A_s(n, 3) \leq \frac{s^n}{(s - 1)n + 1}$, we obtain the
desired bound on $g(D, s)$. ■
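The bound of Lemma 2 is straightforward to evaluate. The sketch below is only an illustration (the function name is an assumption); it shows that for $n = 7$ and $s = 2$ the bound equals 4, which is the binary guessing number of the Paley tournament $P_7$ of Example 9, so the bound is attained in that case.

```python
import math

def lemma2_upper_bound(n, s):
    """Upper bound n - log_s((s - 1) n + 1) on g(D, s) when D has no bidirectional edges."""
    return n - math.log((s - 1) * n + 1) / math.log(s)

bound_p7 = lemma2_upper_bound(7, 2)      # 7 - log_2(8)
gap_100 = 100 - lemma2_upper_bound(100, 2)   # gap between n and the bound for n = 100
```

The gap $n - g(D, s)$ grows like $\log_s n$, which is the quantitative form of the remark that the number of intermediate nodes must tend to infinity.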
Proposition 11 below refines this statement for the linear guessing number of sparse digraphs without
bidirectional edges.

Proposition 11: For any digraph $D$ on $n$ vertices with no bidirectional edges and with minimum
and maximum in-degree $\delta$ and $\Delta$, we have $g_{\mathrm{linear}}(D, s) \leq n - \log_s(n - \delta) - 1$ and $g_{\mathrm{linear}}(D, s) \leq
n - \log_s(n - \Delta - \epsilon) - 2$, where $\epsilon = \max\left\{ d : \binom{n}{\Delta - d + 2} \Big/ \binom{\Delta + 1}{\Delta - d + 2} \geq n \right\}$.
Proof: We first prove the bound based on the minimum in-degree. Let $A \le A_D$ such that 
$l = \mathrm{rk}(I_n + A) = n - g_{\mathrm{linear}}(D, s)$, and denote $B = I_n + A$. Since $D$ has no bidirectional edges, 
all the rows of $B$ are distinct. We consider the $s^l$ vectors in the row space of $B$. Since the fixed 
configurations of the protocol corresponding to $B$ form a code with minimum distance at least 2 by the 
proof of Proposition 5, $s^{l-1}$ vectors have a zero in coordinate $i$ for any $i$. However, let $j$ be a column 
of $B$ with at most $\delta + 1$ ones. Then there are at least $n - \delta - 1$ distinct rows of $B$ with a zero in 
coordinate $j$, and accounting for the all-zero vector, we obtain $s^{l-1} \ge n - \delta$. 

We now prove the bound based on the maximum in-degree. The code with extended parity-check 
matrix $B$ has minimum distance at least 3, therefore its dual code (with dimension $l = \mathrm{rk}(B)$) has the 
following property: for any pair of coordinates $0 \le i < j \le n - 1$, $s^{l-2}$ vectors have $(0, 0)$ in these 
coordinates. Let us give a lower bound on the maximum number, taken over all pairs $\{i, j\}$ of columns, 
of rows of $B$ which have $(0, 0)$ in columns $i$ and $j$. First, note that if $C \le B$, then the rows with $(0, 0)$ 
in $B$ also have $(0, 0)$ in $C$. Therefore, without loss of generality, we can assume all the columns of $B$ 
have weight $\Delta + 1$. The supports of these columns then form a constant-weight code of length $n$, 
weight $\Delta + 1$, and cardinality $n$. As seen in Section III-D, the cardinality $n$ of such a code is bounded 
in terms of its minimum distance $2d$, and therefore $d \le e$. Let $i$ and $j$ be two columns of $B$ at distance 
$2d$; then the union of their supports has cardinality $\Delta + 1 + d$ and there are $n - \Delta - 1 - d$ rows of $B$ 
with $(0, 0)$ in coordinates $i$ and $j$. Accounting for the all-zero vector, there are at least $n - \Delta - d$ 
such vectors, and hence $s^{l-2} \ge n - \Delta - d \ge n - \Delta - e$. 

■ 
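Since the proof identifies $n - g_{\mathrm{linear}}(D, s)$ with a rank $l$, the linear guessing number is an integer, and the inequality $s^{l-1} \ge n - \delta$ from the first half of the proof can be turned into an integer bound. The sketch below does this with exact integer arithmetic; the helper names are illustrative, and only the minimum in-degree bound is covered.

```python
def ceil_log(x: int, s: int) -> int:
    """Smallest integer t >= 0 with s**t >= x (exact, no floating point)."""
    t, power = 0, 1
    while power < x:
        power *= s
        t += 1
    return t

def linear_bound_min_indegree(n: int, delta: int, s: int) -> int:
    """Integer form of g_linear <= n - log_s(n - delta) - 1.

    From s**(l-1) >= n - delta with l = n - g_linear an integer,
    we get l - 1 >= ceil_log(n - delta, s).
    """
    if not (0 <= delta < n) or s < 2:
        raise ValueError("need 0 <= delta < n and s >= 2")
    return n - 1 - ceil_log(n - delta, s)

# Parameters of the digraph of Fig. 10: n = 9 vertices, in-degree 3, binary alphabet.
print(linear_bound_min_indegree(9, 3, 2))
```

For these parameters the integer bound evaluates to 5, which coincides with the linear guessing number stated in the caption of Fig. 10.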

B. Combining digraphs to increase the guessing number 

In Section IV, we showed how to construct digraphs with high guessing numbers for finite parameters. 
In this section, we investigate how to combine digraphs in order to generate infinite families of digraphs 
with high guessing numbers. 

Definition 4: The strong product of two digraphs $H_1$ and $H_2$, denoted as $H_1 \boxtimes H_2$, is defined similarly 
to its counterpart for undirected graphs. Its vertex set is the cartesian product $V(H_1) \times V(H_2)$, and there 
is an edge from $(u_1, u_2)$ to $(v_1, v_2)$ if and only if either $u_1 = v_1$ and $(u_2, v_2) \in E(H_2)$, or $u_2 = v_2$ and 
$(u_1, v_1) \in E(H_1)$, or $(u_1, v_1) \in E(H_1)$ and $(u_2, v_2) \in E(H_2)$. Equivalently, the adjacency matrix of the 
strong product is given by 

$$A_{H_1 \boxtimes H_2} = (I_{n_1} + A_{H_1}) \otimes (I_{n_2} + A_{H_2}) - I_{n_1 n_2},$$

where $\otimes$ denotes the Kronecker product of matrices. 
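The equivalence between the edge rule and the matrix formula in Definition 4 can be checked directly on a small case. The sketch below uses plain Python lists and assumes loopless digraphs, with vertex $(u_1, u_2)$ mapped to index $u_1 n_2 + u_2$; all function names are illustrative.

```python
from itertools import product

def cycle_adjacency(l):
    """Adjacency matrix of the unidirectional cycle on l vertices (i -> i+1 mod l)."""
    return [[1 if j == (i + 1) % l else 0 for j in range(l)] for i in range(l)]

def add_identity(A):
    n = len(A)
    return [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]

def kronecker(X, Y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_x for y in row_y] for row_x in X for row_y in Y]

def strong_product_matrix(A1, A2):
    """(I + A1) (x) (I + A2) - I, the matrix form of Definition 4."""
    K = kronecker(add_identity(A1), add_identity(A2))
    n = len(K)
    return [[K[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]

def strong_product_by_edges(A1, A2):
    """The same matrix built from the three-case edge rule of Definition 4."""
    n1, n2 = len(A1), len(A2)
    M = [[0] * (n1 * n2) for _ in range(n1 * n2)]
    for u1, u2, v1, v2 in product(range(n1), range(n2), range(n1), range(n2)):
        edge = ((u1 == v1 and A2[u2][v2]) or
                (u2 == v2 and A1[u1][v1]) or
                (A1[u1][v1] and A2[u2][v2]))
        M[u1 * n2 + u2][v1 * n2 + v2] = 1 if edge else 0
    return M

C3 = cycle_adjacency(3)
M = strong_product_matrix(C3, C3)
print(M == strong_product_by_edges(C3, C3))
```

The index convention matters: the Kronecker product orders rows by the first factor and then the second, which is why $(u_1, u_2)$ is placed at position $u_1 n_2 + u_2$.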
The properties of the strong product are listed in Proposition 12 below. 

Proposition 12: Let $H_1$ and $H_2$ be two digraphs on $n_1$ and $n_2$ vertices, respectively. Then their strong 
product $H_1 \boxtimes H_2$ has the following properties: 

• It has $n = n_1 n_2$ vertices. 

• If $H_1$ and $H_2$ are both strong and without any bidirectional edges, then so is $H_1 \boxtimes H_2$. 

• If $H_1$ and $H_2$ have regular in-degrees and out-degrees, it is regular with in-degree and out-degree 

$$d(H_1 \boxtimes H_2) = (d(H_1) + 1)(d(H_2) + 1) - 1.$$

• Its linear guessing number satisfies 
$g_{\mathrm{linear}}(H_1 \boxtimes H_2, s) \ge n - (n_1 - g_{\mathrm{linear}}(H_1, s))(n_2 - g_{\mathrm{linear}}(H_2, s))$ 
for all $s$. 

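Proof: The first three properties are easy to verify. We hence prove the lower bound on the linear 
guessing number. Let $A_i \le A_{H_i}$ such that $\mathrm{rk}(I_{n_i} + A_i) = n_i - g_{\mathrm{linear}}(H_i, s)$ for $i = 1, 2$. Then 
$(I_{n_1} + A_1) \otimes (I_{n_2} + A_2) \le (I_{n_1} + A_{H_1}) \otimes (I_{n_2} + A_{H_2}) = I_n + A_{H_1 \boxtimes H_2}$. Since the rank of a 
Kronecker product is the product of the ranks, this yields 
$g_{\mathrm{linear}}(H_1 \boxtimes H_2, s) \ge n - \mathrm{rk}\{(I_{n_1} + A_1) \otimes (I_{n_2} + A_2)\} = n - (n_1 - g_{\mathrm{linear}}(H_1, s))(n_2 - g_{\mathrm{linear}}(H_2, s))$. ■ 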
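The rank argument in this proof can be illustrated for the binary alphabet. The sketch below assumes $s = 2$ and works over GF(2), takes $A_i$ to be the full adjacency matrix of the unidirectional cycle on 3 vertices, and uses the fact from the proof that any admissible matrix of rank $r$ certifies a linear guessing number of at least $n - r$. The helper names are illustrative.

```python
def rank_gf2(matrix):
    """Rank over GF(2) of a 0/1 matrix, by Gaussian elimination on bitmask rows."""
    rows = [int("".join(str(b & 1) for b in row), 2) for row in matrix]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def kronecker(X, Y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_x for y in row_y] for row_x in X for row_y in Y]

def identity_plus_cycle(l):
    """I + A for the unidirectional cycle on l vertices (i -> i+1 mod l)."""
    return [[1 if j in (i, (i + 1) % l) else 0 for j in range(l)] for i in range(l)]

l = 3
B1 = identity_plus_cycle(l)   # rank l - 1 over GF(2)
B = kronecker(B1, B1)         # equals I + A for the strong product of two cycles
r1, r = rank_gf2(B1), rank_gf2(B)
print(r1, r, l * l - r)       # ranks of the factors and of the product, and n - rank
```

Here the product has rank $r_1^2$, so the certified lower bound is $l^2 - (l-1)^2$, in line with the last item of Proposition 12.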

Example 11: For any $k \ge 1$ and $l \ge 3$, denote the unidirectional cycle $C_l$ raised to the power of $k$ 
according to the strong product as $C_l^k$ (for instance, $C_3^2$ is illustrated in Figure 10). Then $C_l^k$ is a strongly 
regular digraph on $n_{l,k} = l^k$ vertices with in-degree and out-degree $d_{l,k} = 2^k - 1$ and linear guessing 
number $g_{l,k} = l^k - (l-1)^k$. The lower bound on the guessing number follows from Proposition 12. The upper 
bound follows from $g(C_l^k, s) \le n - \mathrm{mas}(C_l^k)$ in Proposition 1, where $\mathrm{mas}(C_l^k) = (l-1)^k$ since the vertices 
in $\{0, 1, \ldots, l-2\}^k$ induce an acyclic subgraph. 
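The acyclicity claim used for the upper bound can be checked by brute force on small cases. The sketch below labels the vertices of the $k$-th strong power of the unidirectional cycle on $l$ vertices by $k$-tuples over $\{0, \ldots, l-1\}$, with an edge whenever each coordinate either stays put or advances by one modulo $l$. The function names and the choice $l = 3$, $k = 2$ are illustrative.

```python
from itertools import product

def has_edge(u, v, l):
    """Edge rule of the k-th strong power: every coordinate stays or advances by 1 mod l."""
    return u != v and all(b == a or b == (a + 1) % l for a, b in zip(u, v))

def induces_acyclic_subgraph(vertices, l):
    """Repeatedly delete vertices without in-neighbours; a cycle exists iff we get stuck."""
    remaining = set(vertices)
    while remaining:
        sources = {v for v in remaining
                   if not any(has_edge(u, v, l) for u in remaining)}
        if not sources:
            return False
        remaining -= sources
    return True

l, k = 3, 2
all_vertices = list(product(range(l), repeat=k))
core = list(product(range(l - 1), repeat=k))  # the set {0, ..., l-2}^k
in_degree = sum(has_edge(u, all_vertices[0], l) for u in all_vertices)

print(len(all_vertices), in_degree, len(core))
print(induces_acyclic_subgraph(core, l), induces_acyclic_subgraph(all_vertices, l))
```

Inside the core no coordinate can wrap around from $l-1$ to $0$, so every edge strictly increases the coordinate sum, which is why the induced subgraph has no directed cycle.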

This yields the following construction of a network coding instance. The vertices in $\{0, 1, \ldots, l-2\}^k$ 
induce an acyclic subgraph, therefore we use them as intermediate nodes. The source and sink nodes 
come from the split of the other $l^k - (l-1)^k$ vertices of $C_l^k$. Since the linear guessing number is equal to 
the number of sources, this network coding instance is solvable over any alphabet by linear operations. 

The sequences $C_l^k$ for a fixed $l$ have the following property: the ratio of the guessing number 
to the number of vertices, given by $\frac{g_{\mathrm{linear}}(C_l^k, s)}{n_{l,k}} = 1 - \left(\frac{l-1}{l}\right)^k$, tends to 1 as $k$ tends to infinity. We remark 
that the convergence could be sped up by considering powers of the digraph P-j depicted in Figure 9, 
thus obtaining a ratio of 1 — for alphabets of cardinality equal to a power of 2, but not necessarily 
for other alphabets. 
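The trade-off between this ratio and the in-degree $2^k - 1$ from Example 11 can be tabulated exactly. The following sketch uses rational arithmetic; the function names and the target value 9/10 are illustrative choices.

```python
from fractions import Fraction

def ratio(l: int, k: int) -> Fraction:
    """Guessing number over number of vertices for the k-th strong power of the l-cycle."""
    return 1 - Fraction(l - 1, l) ** k

def smallest_power(l: int, target: Fraction) -> int:
    """Smallest k whose ratio reaches the target (the target must be below 1)."""
    if target >= 1:
        raise ValueError("the ratio never reaches 1")
    k = 1
    while ratio(l, k) < target:
        k += 1
    return k

for l in (3, 4, 8):
    k = smallest_power(l, Fraction(9, 10))
    print(f"l={l}: k={k}, vertices={l**k}, in-degree={2**k - 1}, ratio={float(ratio(l, k)):.4f}")
```

Larger girth $l$ requires a larger power $k$ to reach the same ratio, and hence a larger in-degree, although the in-degree stays small relative to the number of vertices.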

A consequence of Proposition @] is that for any family of digraphs with ratio between the guessing 
number and the number of vertices tending to 1, the maximum in-degree must tend to infinity. On the 
other hand, the digraphs $C_l^k$ become more and more sparse as $l$ and $k$ increase, as 
$d_{l,k} + 1 = 2^k = n_{l,k}^{\log_l 2}$, and hence we can easily construct sequences of strong digraphs with regular 
in-degree on the order of $n^{\epsilon}$ for any $\epsilon > 0$. In Theorem 3 below, we strengthen this result by 
constructing strong digraphs with the ratio of the guessing number over the number of vertices tending 
to 1 and in-degree tending to infinity as slowly as possible. 

Fig. 10. The digraph $C_3^2$, with linear guessing number 5. 

Theorem 3: For any $l \ge 3$ and any function $f(n)$ of $n \ge 1$ tending to infinity, there exists an infinite 
family of strong digraphs $D_k$ on $n_k$ vertices (nondecreasing sequence) with girth $l$ and regular in-degree 
and out-degree $d_k$ such that $d_k \le f(n_k)$ for all $k \ge 1$ and $\lim_{k \to \infty} \frac{g_{\mathrm{linear}}(D_k, s)}{n_k} = 1$ for any 
$s \ge 2$. 

Proof: For all $k$, let $n_k$ be the smallest multiple of $l^k$ such that $f(n) \ge 2^k$ for all $n \ge n_k$. Then 
select $m_k = \frac{n_k}{l^k}$ copies of $C_l^k$ and join them by tying a directed cycle around all the vertices. The 
cycle goes across the different copies as follows. Sort the vertices of $C_l^k$ in lexicographic order, so that 
$(v_i, v_{i+1})$ is an edge for all $0 \le i \le l^k - 1$, and denote the vertices of the obtained digraph as $v_i^a$, where 
$0 \le a \le m_k - 1$. The cycle is then formed by edges $(v_0^0, v_0^1), \ldots, (v_0^{m_k-2}, v_0^{m_k-1})$ and an edge $(v_0^{m_k-1}, v_1^0)$, 
and so on. 

The obtained digraph has $n_k$ vertices and in-degree $d_k = 2^k$, and hence $f(n_k) \ge d_k$. Furthermore, 
it can be easily shown that this digraph has girth $l$ and satisfies 
$\frac{g_{\mathrm{linear}}(D_k, s)}{n_k} \ge \frac{g_{\mathrm{linear}}(C_l^k, s)}{l^k} \ge 1 - \left(\frac{l-1}{l}\right)^k$ 
by Example 11, which tends to 1. ■ 
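The degree and girth claims in this proof can be checked by brute force on a small instance. The sketch below rests on an assumption about the order of the joining cycle: it visits vertex $i$ of every copy in turn before moving on to vertex $i+1$. It also assumes at least two copies, so that the cycle edges are new. The function names and the parameters $l = 3$, $k = 2$, two copies, are illustrative.

```python
from collections import deque
from itertools import product

def build_joined_digraph(l, k, m):
    """Out-neighbour sets of m copies of the k-th strong power of the l-cycle, joined by a cycle.

    Vertex (a, i) is the i-th vertex, in lexicographic order, of copy a.
    Assumed cycle order: (0,i) -> (1,i) -> ... -> (m-1,i) -> (0,i+1) -> ...
    """
    tuples = list(product(range(l), repeat=k))  # product() yields lexicographic order
    index = {t: i for i, t in enumerate(tuples)}
    size = len(tuples)
    out = {(a, i): set() for a in range(m) for i in range(size)}
    for a in range(m):
        for t in tuples:
            # Inside a copy: each coordinate stays or advances by 1 mod l, not all staying.
            for step in product((0, 1), repeat=k):
                if any(step):
                    w = tuple((x + d) % l for x, d in zip(t, step))
                    out[(a, index[t])].add((a, index[w]))
    for i in range(size):
        for a in range(m):
            nxt = (a + 1, i) if a + 1 < m else (0, (i + 1) % size)
            out[(a, i)].add(nxt)
    return out

def girth(out):
    """Length of a shortest directed cycle, by breadth-first search from every vertex."""
    best = None
    for start in out:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in out[u]:
                if v == start:
                    if best is None or dist[u] + 1 < best:
                        best = dist[u] + 1
                    queue.clear()
                    break
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return best

D = build_joined_digraph(3, 2, 2)
print(len(D), {len(nb) for nb in D.values()}, girth(D))
```

With two copies and $l = 3$, $k = 2$, the digraph has 18 vertices, every out-degree equals $2^k = 4$, and the shortest directed cycle has length 3.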
Theorem 3 implies that there exist network coding instances with a relatively small number of intermediate 
nodes, a relatively small number of edges entering or leaving each node, and an arbitrarily long path 
between each source and its corresponding sink. These instances are linearly solvable over any alphabet, 
and the operation at each node is known. 

VI. Conclusion and open problems 

In this paper, we proved that the problem of deciding whether a network coding instance is solvable 
reduces to a problem on the independence number of a related undirected graph, referred to as the 
guessing graph. Although we have derived bounds on this independence number, how to efficiently 
compute it remains an open problem. A brute force approach would be computationally infeasible, as 
the maximum independent set problem is NP-hard. Also, algorithms for the maximum independent set 
problem on general graphs are inappropriate, for the size of the guessing graph grows exponentially with 
the number of nodes in the original network coding instance. However, the guessing graph has many 
symmetries (its structure is fixed by the original instance), hence specific algorithms could be devised to 
bound or compute its independence number. The relationship between this problem and coding theory 
is of particular interest. In particular, we exhibited classes of network coding instances for which the 
maximum independent set of the guessing graph is given by cyclic codes. 

The second contribution of our paper is the design of a family of digraphs for which the ratio between 
the guessing number and the number of vertices tends to one, although they have a large girth and are 
sparse. This family of digraphs yields a family of solvable network coding instances, for which binary 
linear operations are sufficient. Although we gave necessary and sufficient conditions on the sparsity of 
the graph in terms of edges, the maximum rate at which this ratio can converge to one remains unknown. 
Similarly, the relation between the guessing number and the girth seems to be an interesting problem for 
network design. 